World's most popular travel blog for travel bloggers.

How to prove that DFAs from NFAs can have exponential number of states?

, , No Comments
Problem Detail: 

All non-deterministic finite automata can be turned into equivalent deterministic finite automata. However, a deterministic finite automata only allows a single arrow per symbol pointing from a state. Therefore, its states should be members of the power set of states of the NFA. This seems to indicate that the number of states of the DFA could scale exponentially in terms of the number of states of the NFA. However, I was wondering how to actually prove this.

Asked By : John Hoffman

Answered By : Gilles

One operation that transforms an NFA into another NFA but does not do so for a DFA is reversal (point all the arrows the other way round, and swap initial states with accepting states). The language recognized by the transformed automaton is the reversed language $L^R = \{u_{n-1}\ldots u_0 \mid u_0\ldots u_{n-1} \in L\}$.

Thus one idea is to look for a language that has an asymmetric construction. Going forward, this language should be recognized by inspecting the first $n$ symbols, requiring only $n + O(1)$ states. Going backwards, it should be necessary to keep a memory of the last $n$ states, which requires $A^n + O(1)$ states where $A$ is the alphabet size.

We're looking for a language of the form $M_n S M'$ where $M_n$ consists of words of length $n$, $S$ is a nontrivial subset of the alphabet, and $M'$ doesn't provide any further constraint. We might as well pick the simplest alphabet $\mathscr{A} = \{a,b\}$ (a singleton alphabet won't do, you don't get smaller NFAs there) and $M' = \mathscr{A}^*$. A nontrivial $S$ means $S = \{a\}$. As for $M_n$, we require that it does not correlate with $S$ (so that the DFA for the reversed language will need to keep the memory of $S$): take $M_n = \mathscr{A}^n$.

Thus let $L_n = (a|b)^n a (a|b)^*$. It is recognized by a simple DFA with $n+2$ states.

dfa

Reversing it yields an NFA that recognizes $L_n^R = (a|b)^* a(a|b)^n$.

nfa

The minimal DFA that recognizes $L_n^R$ has at least $2^{n+1}$ states. This is because all the words of length $2^{n+1}$ must reach distinct states in the DFA. (In other words, they belong to distinct Myhill-Nerode equivalence classes.) To prove this, take two distinct words $u,v \in \mathscr{A}^{n+1}$ and let $k$ be a position where they differ ($u_k \ne v_k$). Without loss of generality, let's assume $u_k = a$ and $v_k = b$. Then $u b^k \in L_n^R$ and $v b^k \notin L_n^R$ ($b^k$ is a distinguishing extension for $u$ and $v$). If $u$ and $v$ led to the same state in a DFA recognizing $L_n^R$ then so would $u b^k$ and $v b^k$, which is impossible since one leads to an accepting state and the other one doesn't.

Acknowledgement: this example was cited in Wikipedia without explanations. The article gives a reference to an article that I haven't read which gives a tighter bound:
Leiss, Ernst (1981), "Succinct representation of regular languages by Boolean automata", Theoretical Computer Science 13 (3): 323–330, doi:10.1016/S0304-3975(81)80005-9.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/6063

0 comments:

Post a Comment

Let us know your responses and feedback