ω-Languages

Let $Σ$ be a set of symbols (not necessarily finite). Following the standard definition from formal language theory, $Σ*$ is the set of all finite words over $Σ$. Every finite word has a length, which is a natural number. Given a word w of length n, w can be viewed as a function from the set $\{0,1,...,n−1\} → Σ$, with the value at i giving the symbol at position i. The infinite words, or ω-words, can likewise be viewed as functions from $\mathbb{N}$ to $Σ$. The set of all infinite words over $Σ$ is denoted $Σ^{\omega}$. The set of all finite and infinite words over $Σ$ is sometimes written $Σ^{\infty}$ or $Σ^{\leq \omega}$. —— Wikipedia

In simple terms, given an alphabet $Σ$, $Σ^{\omega}$ is the set of all infinite strings formed from it, and an ω-language over $Σ$ is a subset of this set.

For comparison: $\Sigma^*$ is all finite-length strings from alphabet $Σ$.

$$ Σ^{\infty} = \Sigma^* \cup \Sigma^{\omega} $$

Operations on ω-Languages

Left Concatenation

$KL$, where K is a language containing only finite-length strings, and L is an ω-language, represents left concatenation of K to L, meaning for each string $k \in K$ and $l \in L$ there is a new string $kl \in KL$.

Note that right concatenation ($LK$) is meaningless, because $L$ is already an ω-language containing infinite-length strings, and adding finite-length strings after infinite-length strings makes no sense.

$^\omega$

$L^ω$ represents the language obtained by "infinitely repeating" all finite strings in language $L$.

ω-Regular Languages

A language is an ω-regular language if it can be written as:

  • $A^ω$, where $A$ is a regular language that doesn't contain the empty string, or:
  • $A\cdot B$, where $A$ is a regular language and B is an ω-regular language
  • $A \cup B$, where both $A$ and $B$ are ω-regular languages, note that $\cup$ can only be performed finitely many times

ω-Regular Expressions

$\cdot$, $|$1 and $*$ have the same meanings in normal regular expressions.

However, in ω-regular expressions, we have a new operator $^{\omega}$, where $E^{\omega}$ means $E$ should be repeated infinitely many times.

For example, an ω-regular expression describing an infinite repeating binary decimal can be written as:

$$ (0|1)(0|1)*'.'(0|1)(0|1)*((0|1)*)^{\omega} $$

where the final $((0|1)*)^{\omega}$ part represents the repeating section.

Büchi Automata2

A deterministic Büchi automaton is a tuple A = (Q,Σ,δ,q0,F) that consists of the following components:
Q is a finite set. The elements of Q are called the states of A.
Σ is a finite set called the alphabet of A.
δ: Q × Σ → Q is a function, called the transition function of A.
q0 is an element of Q, called the initial state of A.
F⊆Q is the acceptance condition. A accepts exactly those runs in which at least one of the infinitely often occurring states is in F.
In a (non-deterministic) Büchi automaton, the transition function δ is replaced with a transition relation Δ that returns a set of states, and the single initial state q0 is replaced by a set I of initial states.
—— Wikipedia

Basically, Büchi automata and finite automata "look the same", with the only difference being:

  • Finite automata must stop in an accepting state after the input is "exhausted" to accept the input string, while Büchi automata require the input string to enter an accepting state "infinitely many times".

Note: The deterministic and non-deterministic versions of Büchi automata are not equivalent. Deterministic Büchi automata are strictly weaker than their non-deterministic counterparts, while non-deterministic Büchi automata can recognize all ω-regular expressions.

Example: $(0|1)*0^\omega$ can be recognized by the following non-deterministic Büchi automaton, but cannot be accepted by a deterministic Büchi automaton.

ω-Regular Expressions $\rightarrow$ Büchi Automata

Inductively, we need to consider 3 cases:

  • Top-level ω-regular expression of the form $A^ω$. In this case, $A$ must be a regular expression, which we can convert to a non-deterministic finite automaton (NFA). Then, for all accepting states of this NFA, add a transition arrow to each successor of A's initial state, with the transition character being the character on the arrow from the initial state to that successor, as shown:

    A-to-A-w

    I've only drawn one accepting state, but there can be multiple accepting states, each following the pattern of $a_0$ in the diagram, connecting back to each successor of the entry.

  • Top-level ω-regular expression of the form $A\cdot B$. In this case, $A$ must be a regular expression which we can convert to an NFA, while $B$ must be an ω-regular expression that can be (recursively) converted to a Büchi automaton using our current algorithm. Then, downgrade A's accepting states to normal states and directly connect them to each successor of B's initial state, as shown:

    A-B AB

  • Top-level ω-regular expression of the form $A | B$. Where both $A$ and $B$ are ω-regular expressions that can be (recursively) converted to Büchi automata using our current algorithm. In this case, simply merge the initial states of $A$ and $B$.

    A|B

    A||B

Example

Provided by the Formal Methods example exam, in our course $|$ is written as $+$.

$$ a(b^ω + cc*ab^ω) $$

The top level is of the form $A\cdot B$, where $A \rightarrow a$, $B \rightarrow b^ω + cc*ab^ω$.

We can first complete the NFA part of $a$.

A->unexp

Continue expanding $b^ω + cc*ab^ω$, where the top level is $+$.

Expand into two parts:

a->unexp12

First consider the $b^\omega$ part, of the form $A^ω$, where $A \rightarrow b$, which can be expanded directly.

a->b*|a->unexp2

The $cc*ab^ω$ part is of the form $AB$, where $A \rightarrow cc*a$, $B \rightarrow b^\omega$.

a->b*|a->cc*a->unexp2

Finally expand $b^\omega$.

full-expanded

And we get the final result.

Büchi Automata $\rightarrow$ ω-Regular Expressions

From the process of converting ω-regular expressions to Büchi automata above, we can see that all ω-regular expressions:

  • May be of the form $A^ω$, corresponding to loops containing accepting states3 in the Büchi automaton graph.
  • May be of the form $AB$, corresponding to:
    • $A$ part: A "path" $q_{start} \to q_{k}$ starting from the initial state $q_{start}$ to accepting state $q_{k}$, but not passing through $q_{k}$.
    • $B$ part: $q_{k}$ is in a loop3.
  • May be of the form $A|B$, corresponding to both $A$ and $B$ having at least one loop containing an accepting state.

So, to convert a Büchi automaton back to an ω-regular expression, we need to:

  • Identify all loops containing accepting states $q_k$ and convert these subgraphs to regular expressions $L_k$.
  • Identify all paths from the initial state $q_{start}$ to $q_k$ (without passing through $q_k$) and convert these subgraphs to regular expressions $R_{0k}$.
  • The desired ω-regular expression is $\Sigma_k^{0\le k<\text{accept state count}}(R_{0k}\cdot L_{k}^{\omega})$.

Example

am-full

For accepting state $q_1$:

  • Its cycle:

    am-full

    Corresponding regex is $b$4.

  • Path from initial state $e$ to $q_1$:

    e-to-q1

    Corresponding regex is: $((bc*a)|a)*c$.

Therefore this branch's ω-regular expression is $((bc*a)|a)*c(b^\omega)$.

For accepting state $q_2$:

  • Its cycle:

    complex-w

    Corresponding regex is: $(c*)|(aa*b)$

  • Path from initial state $e$ to $q_2$:

    e-to-q2

    Corresponding regex is: $a*b$.

Therefore this branch's ω-regular expression is $a*b ((c*)|(aa*b))^\omega$.

Combining both branches, the overall ω-regular expression is $(((bc*a)|a)*c(b^\omega)) | (a*b ((c*)|(aa*b))^\omega)$

References

1

Sometimes written as $+$, $\cup$ or even $\uplus$ - mathematicians/computer scientists really should standardize their notation!

2

I mention Büchi automata here because it was covered in our Formal Methods course, but there are actually many types of automata that can handle ω-languages, see Wikipedia, among which Rabin automata are probably the most interesting.

3

A more accurate term would be "strongly connected component".

4

Could also be $b*$, though $(b*)^\omega \equiv b^\omega$.