World's most popular travel blog for travel bloggers.

[Solved]: DFA for a strings whose every subsequence of length five has at least two zeroes

, , No Comments
Problem Detail: 

I have a regular language consisting of such {0,1}^k sequences, in which every subsequence of length 5 has at least two 0's in it (which also means every sequence of length <5 is 'good'). I need to find a DFA which generates this language.

I can't say I tried everything, but each time I encountered the same problem, that is it always generated a subsequence of the form 11111... (I tried incorporating each subsequence of length 5 with two 0's in my DFA so that the last element returned to the beginning of the automaton, but this resulted in accepting 'bad' subsequences - for instance 00111 and 11100 were 'good', but 0011111100 wasn't).

Is there any way to do it?

Asked By : Jules

Answered By : Rick Decker

There is a standard technique for problems like this, where the language can be specified in the form

All words over some alphabet $\Sigma$ where [some condition] is satisfied by all contiguous substrings of a fixed length, $k$.

The key idea here is to define a collection of states $s_i$ where $0\le i \le |\Sigma|^k$. Basically, you are using the states to "remember" the most recently-seen $k$ characters. These states will be part of the finite automaton we'll build for our language.

In what follows, I'll use a simpler language than yours, so that the FA will be sufficiently small to fit here and still be readable. Let $$ L=\{w\in\{0, 1\}^*\mid \text{every length-3 substring of $w$ contains at least 2 zeros}\} $$ We'll define states $s_0, s_1, \dotsc, s_7$ to represent the three most recently-seen characters. It's convenient to do this representation in lexicographic order, so $s_0=\mathtt{000}$, $s_1=\mathtt{001}$, $s_2=\mathtt{010}$, $s_3=\mathtt{011}$, $s_4=\mathtt{100}$, $s_5=\mathtt{101}$, $s_6=\mathtt{110}$, and $s_7=\mathtt{111}$. In other words, $s_N$ will correspond to the 3-bit binary representation of $N$.

Now we'll add some more states to get from the start state to the $s$'s. All it takes is to construct a complete tree (binary in this example) having the $s$ states as leaves, like this: enter image description here

All of the states in the FA above will be accepting states, except for the darkened ones. The $p_i$ states are all accepting, since we haven't yet seen a length-3 string, so we haven't violated the condition that defines the language $L$ and only the patterns $\mathtt{011}, \mathtt{101}$, $\mathtt{110}$, and $\mathtt{111}$ violate the condition of the language. These correspond to states $s_3, s_5, s_6, s_7$. Once we've entered one of these states, we reject the input word, so we might as well merge these into a single "dead" state, $d$, from which there will be no exit.

We're almost done; all that remains is to fill in the transitions among the $s_i$. Because of the way we chose the representations of the states, that's an easy task. If we're in state $s_i$, having just seen characters $b_1b_2b_3$, on input $b$ we'll pass to the state $s_j$, corresponding to the new pattern $b_2b_3b$. If you think about it for a moment, you'll see that $\delta(s_i, 0) = s_j$ where $j\equiv 2i\pmod8$ and $\delta(s_i, 1) = s_j$ where $j\equiv 2i+1\pmod8$. Thus, we'll have the following $s$ transitions (recall that we merged states $s_3, s_5, s_6, s_7$ into a single dead state, $d$): $$\begin{array}{c|cc} & 0 & 1 \\ \hline s_0 & s_0 & s_1 \\ s_1 & s_2 & d \\ s_2 & s_4 & d \\ s_4 & s_0 & s_1 \\ d & d & d \end{array}$$

completing the construction. The nice feature of this technique is that it's almost completely mechanical: the only hard part is making the $s$ transitions. The downside is that it produces a pretty big FA. Even in this simple example, we wound up with a 12-state FA, and if we had needed to look at the five most recent characters, we'd have a FA with 31 states just for the $p$s. In fact, if we apply the standard technique for DFA minimization, we would find that the minimal-state DFA for this example language would require only 7 states, with 6 of them being final.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/39960

0 comments:

Post a Comment

Let us know your responses and feedback