Problem Definition

For a two-player game between an environment (the uncontrollable part of the overall setting) and a system (typically the program to be synthesized), we abstract the game as an 8-tuple:

  • $\mathcal{X}$: Set of variables controlled by the environment
  • $\mathcal{Y}$: Set of variables controlled by the system
  • $\mathcal{V} = \mathcal{X} \cup \mathcal{Y}$
  • $\theta_e$: Initial assertion about the environment, composed of variables in $\mathcal{X}$
  • $\theta_s$: Initial assertion about the system, composed of variables in $\mathcal{Y}$
  • $\rho_e: (\Sigma_V, \Sigma_X) \rightarrow \text{bool}$: Transition assertion selecting the next action of the environment. The first argument is the current valuation of all (environment and system) variables, and the second is a candidate environment action. $\rho_e(s, x) = \text{true}$ means that when the current variables have value $s$, the environment may take action $x$.
  • $\rho_s: (\Sigma_V, \Sigma_X, \Sigma_Y) \rightarrow \text{bool}$: Transition assertion selecting the next action of the system. The first argument is the current valuation of all variables, the second is the action the environment has just taken, and the third is a candidate system action. $\rho_s(s, x, y) = \text{true}$ means that when the current variables have value $s$ and the environment takes action $x$, the system may respond with action $y$.
  • $\varphi$: An LTL formula defining the winning condition (for the system)

The goal of the algorithm is to find a strategy $f: (M, \Sigma_V, \Sigma_X) \rightarrow (M, \Sigma_Y)$, such that:

  • If $\rho_e(s, s_X) = \text{true}$, then $\rho_s(s, s_X, \text{snd}(f(m, s, s_X))) = \text{true}$; that is, $f$ only ever chooses actions the system is permitted to take.
  • Starting from any winning state $s_0$, every play (state sequence) consistent with $f$ is won by the system: $(s_0, s_1, \cdots) \models \varphi$.

Here, $M$ is a memory space with a designated initial value $m_0$. In the GR(1) solution below, the memory is simply the index of the system goal currently being pursued (the Zn domain in the strategy-extraction code).

This paper solves the Generalized Reactivity(1) game, where $\varphi$ takes the form: $$ \varphi = \bigwedge_{i=1}^{m} \mathbf{G}(\mathbf{F}(J_i^e)) \rightarrow \bigwedge_{j=1}^{n} \mathbf{G}(\mathbf{F}(J_j^s)) $$ where $J_i^e$ and $J_j^s$ are Boolean state formulas representing the liveness conditions that the environment is assumed to guarantee and that the system must achieve, respectively.
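As a minimal illustration (not an example from the paper), consider a one-request arbiter: the environment controls a request bit, the system controls a grant bit, and each side has a single liveness condition:

$$ \mathcal{X} = \{r\}, \quad \mathcal{Y} = \{g\}, \quad m = n = 1, \quad J_1^e = \neg r, \quad J_1^s = (r \rightarrow g) $$

so that $\varphi = \mathbf{G}\mathbf{F}(\neg r) \rightarrow \mathbf{G}\mathbf{F}(r \rightarrow g)$: provided the environment lowers its request infinitely often, the system must infinitely often be in a state where any pending request is granted.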

Algorithm

Required Concepts

μ-calculus

In theoretical computer science, the modal μ-calculus (Lμ, sometimes just μ-calculus, although this can have a more general meaning) is an extension of propositional modal logic (with many modalities) by adding the least fixed point operator μ and the greatest fixed point operator ν, thus a fixed-point logic. - Modal μ-calculus, Wikipedia

Essentially, it adds the fixed-point operators $\mu$ and $\nu$ and the one-step modal operators $[-]$ and $\langle - \rangle$ to propositional logic (with the usual requirement that a variable bound by $\mu$ or $\nu$ occurs only positively, so that the fixed points exist).

The syntax is as follows:

$$ \varphi ::= v \mid \neg \varphi \mid X \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid [\varphi] \mid \langle \varphi \rangle \mid \mu X \varphi \mid \nu X \varphi $$

A formula $\phi$ is interpreted as the subset of the states $\Sigma$ of $G$ in which $\phi$ holds, denoted $[[\phi]]_G^E$, where $G$ is the game structure and $E: \text{Map}\langle\text{Var}, 2^{\Sigma}\rangle$ maps each relation variable¹ to a set of states.

The (denotational) semantics of the newly introduced operators are as follows:

  • $[[X]]_G^E = E[X]$

  • $[[ [\varphi] ]]_G^E$ is the set of states from which the system can force the next state into $[[\varphi]]_G^E$: for every environment action permitted by $\rho_e$, the system has some response permitted by $\rho_s$ that leads into $[[\varphi]]_G^E$.

  • $[[ \langle \varphi \rangle ]]_G^E$ is the dual: the set of states from which the environment can force the next state into $[[\varphi]]_G^E$, no matter how the system responds.

  • $ [[ \mu X \varphi ]]_G^E $, the least fixed point, is the set of states from which $\varphi$ can be made to hold within a finite number of steps; it is computed by iterating upward from the empty set:

    $[[ \mu X \varphi ]]_G^E = \bigcup_{i} S_i $ where $S_0 = \emptyset$ and $S_{i+1} = [[ \varphi ]]_G^{E[X \leftarrow S_i]}$

  • $ [[ \nu X \varphi ]]_G^E $, the greatest fixed point, is the set of states from which $\varphi$ can be kept holding forever; it is computed by iterating downward from the full state set:

    $[[ \nu X \varphi ]]_G^E = \bigcap_{i} S_i $ where $S_0 = \Sigma$ and $S_{i+1} = [[ \varphi ]]_G^{E[X \leftarrow S_i]}$
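As a concrete (if naive) illustration, here is a minimal explicit-state sketch in Java of these fixed-point iterations. It is not from the paper: the state count and the cox operator (the $[-]$ used below) are hypothetical inputs; the point is only the $S_0, S_1, \cdots$ iteration shown above. For example, reach computes $\mu X (p \vee [X])$, the states from which the system can force the play into a $p$-state within finitely many steps.

import java.util.BitSet;
import java.util.function.UnaryOperator;

class FixpointDemo {
    static final int NUM_STATES = 64; // hypothetical finite state space

    // Least fixed point: S_0 = empty set, S_{i+1} = op(S_i), until stable.
    static BitSet mu(UnaryOperator<BitSet> op) {
        BitSet s = new BitSet(NUM_STATES);
        for (BitSet next = op.apply(s); !next.equals(s); next = op.apply(s)) {
            s = next;
        }
        return s;
    }

    // Greatest fixed point: S_0 = all states, S_{i+1} = op(S_i), until stable.
    static BitSet nu(UnaryOperator<BitSet> op) {
        BitSet s = new BitSet(NUM_STATES);
        s.set(0, NUM_STATES);
        for (BitSet next = op.apply(s); !next.equals(s); next = op.apply(s)) {
            s = next;
        }
        return s;
    }

    // mu X (p \/ [X]): states from which the system can force reaching a p-state.
    static BitSet reach(BitSet p, UnaryOperator<BitSet> cox) {
        return mu(x -> {
            BitSet r = (BitSet) p.clone();
            r.or(cox.apply(x)); // p \/ [X]
            return r;
        });
    }
}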

Formula to Solve

The final formula to solve is:

$$ \varphi_{gr} = \nu \begin{bmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_n \end{bmatrix} \begin{bmatrix} \mu Y (\bigvee_{i=1}^{m} \nu X ((J_1^s \wedge [Z_2]) \vee [Y] \vee (\neg J_i^e \wedge [X]))) \\ \mu Y (\bigvee_{i=1}^{m} \nu X ((J_2^s \wedge [Z_3]) \vee [Y] \vee (\neg J_i^e \wedge [X]))) \\ \vdots \\ \mu Y (\bigvee_{i=1}^{m} \nu X ((J_n^s \wedge [Z_1]) \vee [Y] \vee (\neg J_i^e \wedge [X]))) \end{bmatrix} $$

The outermost $\nu Z$ and the internal $[Z_{j \oplus 1}]$ (where $j \oplus 1$ abbreviates $(j \bmod n) + 1$) express that once the system satisfies the current goal $J_j^s$, it moves on to pursue the next goal $J_{j \oplus 1}^s$.

The middle $\mu Y$ is a least fixed point: it guarantees that each row, $\bigvee_{i=1}^{m} \nu X ((J_j^s \wedge [Z_{j \oplus 1}]) \vee [Y] \vee (\neg J_i^e \wedge [X]))$, is achieved within finitely many steps, so the system keeps making measurable progress toward the current goal.

The innermost $\nu X$ and the internal $\neg J_i^e \wedge [X]$ express that if the environment stops satisfying some $J_i^e$, the system may stay in $X$ forever without making progress: the environment then violates its liveness assumption, the premise of $\varphi$ is false, and the system wins vacuously.
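For intuition, with a single assumption and a single guarantee ($m = n = 1$) the matrix collapses to the classic one-pair form:

$$ \varphi_{gr} = \nu Z \, \mu Y \, \nu X \, ((J^s \wedge [Z]) \vee [Y] \vee (\neg J^e \wedge [X])) $$

Read inside-out: a state is winning if the system can reach $J^s$ while staying in the winning region ($[Z]$), can get strictly closer to doing so ($[Y]$), or can park forever in states where the environment fails $J^e$ ($\neg J^e \wedge [X]$).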

Solution

The paper gives pseudocode that computes the $X$, $Y$, and $Z$ sets symbolically using BDDs:

BDD calculate_win() {
    BDD Z = TRUE;
    
    // Outer νZ fixpoint iteration
    for (Fix fZ = new Fix(); fZ.advance(Z);) { 
        mem.clear();

        // for each system goal (J^s_j)
        for (int j = 1; j <= sys.numJ(); j++) {
            BDD Y = FALSE;

            // Middle μY fixpoint iteration
            for (Fix fY = new Fix(); fY.advance(Y);) { 
                // Construct starting set: either already satisfy J^s_j and enter next phase Z_{j⊕1}, or already in Y
                BDD start = sys.Ji(j).and(cox(Z)).or(cox(Y));
                Y = FALSE;

                for (int i = 1; i <= env.numJ(); i++) { // Iterate through each environment assumption J^e_i
                    BDD X = Z;

                    // Innermost νX fixpoint iteration
                    for (Fix fX = new Fix(); fX.advance(X);) { 
                        // If environment does not satisfy J^e_i, system can "stay in X"; otherwise must advance through start
                        X = start.or(env.Ji(i).not().and(cox(X)));
                    }

                    // Store X set for j-th goal and i-th assumption
                    mem.addX(j, i, X);
                    // Add all X to Y to build μY
                    Y = Y.or(X);
                }
                // Store Y (representing set of states where goal j can be achieved in finite steps)
                mem.addY(j, Y); 
            }
            // Use latest Y as input for next outer Z iteration (advance νZ)
            Z = Y; 
        }
    }
    return Z;
}
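The Fix helper drives each fixpoint loop but is not spelled out in the listing; a plausible minimal version (an assumption for completeness, not the paper's code) remembers the previous iterate and reports whether it changed:

class Fix {
    private BDD prev = null;

    // Returns true while the iterate still differs from the last one seen,
    // so each for-loop above runs until its fixed point is reached.
    boolean advance(BDD current) {
        if (prev != null && prev.equals(current)) return false;
        prev = current;
        return true;
    }
}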

Here, cox(x) implements the operator $[x]$: the set of states from which, for every environment action allowed by $\rho_e$, the system has a response allowed by $\rho_s$ whose successor state lies in x.
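A possible implementation of cox, in the same hypothetical pseudocode style as the listings (prime(b) renames a BDD to next-state variables; primedSysVars() and primedEnvVars() return the primed variable sets to quantify out; all three are assumed helpers, not from the paper):

BDD cox(BDD x) {
    // Exists y': some system action allowed by rho_s leads into x.
    BDD sysCanReach = sys.trans().and(prime(x)).exist(primedSysVars());
    // Forall x': for every environment action allowed by rho_e,
    // such a system response exists, so the system can force x.
    return env.trans().imp(sysCanReach).forAll(primedEnvVars());
}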

Strategy Extraction

void build_symbolic_controller() {
    ctrl = new FDS("symbolic_controller");

    Zn = ctrl.newBDDDomain("Zn", 1, sys.numJ());

    // Combine environment and system transition relations
    BDD tr12 = sys.trans().and(env.trans());

    // When subgoal J^s_j is satisfied, rotate Zn to next goal (j → j ⊕ 1)
    for (int j = 1; j <= sys.numJ(); j++) {
        BDD rho1 = (Zn.eq(j))                 // Current memory state is j (processing j-th goal)
                   .and(Z)                    // Current state is in Z
                   .and(sys.Ji(j))            // Current state satisfies system goal J^s_j
                   .and(tr12)                 // Current transition is legal
                   .and(next(Z))              // Next state must still be in region Z
                   .and(next(Zn).eq((j % sys.numJ()) + 1)); // Next state switches Zn to j⊕1
        // Add this transition to controller's transition relation
        ctrl.disjunctTrans(rho1);
    }

    // Advance from the current layer mem.Y(j, r) to a strictly lower layer mem.Y(j, r') with r' < r, until the goal is achieved
    for (int j = 1; j <= sys.numJ(); j++) {
        BDD low = mem.Y(j, 1); // r=1 represents state closest to achieving goal
        for (int r = 2; r <= mem.maxr(j); r++) { // Search from upper layer down
            BDD rho2 = (Zn.eq(j))               // Current goal is j
                        .and(mem.Y(j, r))       // Current distance to goal is r steps
                        .and(low.not())         // Current state is not in a lower layer yet (its rank is exactly r)
                        .and(tr12)              // Legal transition
                        .and(next(low))         // Next state enters lower "layer" (r')
                        .and(next(Zn).eq(j));   // Zn remains unchanged, still in current goal
            low = low.or(mem.Y(j, r)); // Update "known lower layer" set
            ctrl.disjunctTrans(rho2); // Add to controller
        }
    }

    // If environment violates assumption J^e_i, system does not need to advance
    for (int j = 1; j <= sys.numJ(); j++) {
        BDD low = FALSE;
        for (int r = 2; r <= mem.maxr(j); r++) {
            for (int i = 1; i <= env.numJ(); i++) {
                BDD rho3 = (Zn.eq(j))                  // Current subgoal j
                            .and(mem.X(j, r, i))       // Currently in environment assumption i's advance layer X
                            .and(low.not())            // Current state not already covered by an earlier (r, i) case
                            .and(env.Ji(i).not())      // Environment violates assumption J^e_i
                            .and(tr12)                 // Transition is legal
                            .and(next(mem.X(j, r, i))) // Next state continues in X (i.e., maintain)
                            .and(next(Zn).eq(j));      // Zn remains unchanged
                low = low.or(mem.X(j, r, i));          // Update lower layer state set
                ctrl.disjunctTrans(rho3);              // Add transition
            }
        }
    }
}
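Once built, the controller's transition relation is the disjunction of all the rho1, rho2, and rho3 parts, and any enabled choice keeps the play winning. A minimal sketch of executing it (pick(), readEnvInput(), and unprime() are assumed helpers, not part of the paper's code):

void run(BDD init) {
    // Start in any winning initial state, with memory Zn pointing at goal 1.
    BDD state = pick(init.and(Z).and(Zn.eq(1)));
    while (true) {
        BDD envMove = readEnvInput(); // the environment's choice of primed X-variables
        // Successor choices enabled by rho1 \/ rho2 \/ rho3:
        BDD succs = state.and(envMove).and(ctrl.trans());
        state = unprime(pick(succs)); // any pick is safe; all stay in the winning region
    }
}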
¹ Relation variables generally stand for a (yet-to-be-computed) set of states.