Are there improvements on Dana Angluin's algorithm for learning regular sets

posted by , on 7:40:00 AM , No Comments

Problem Detail:

In her 1987 seminal paper Dana Angluin presents a polynomial time algorithm for learning a DFA from membership queries and theory queries (counterexamples to a proposed DFA).

She shows that if you are trying to learn a minimal DFA with $n$ states, and your largest countexample is of length $m$, then you need to make $O(mn^2)$ membership-queries and at most $n - 1$ theory-queries.

Have there been significant improvements on the number of queries needed to learn a regular set?

References and Related Questions

Dana Angluin (1987) "Learning Regular Sets from Queries and Counterexamples", Infortmation and Computation 75: 87-106
Lower bounds for learning in the membership query and counterexample model

Asked By : Artem Kaznatcheev

Answered By : Artem Kaznatcheev

In his answer on cstheory.SE, Lev Reyzin directed me to Robert Schapire's thesis that improves the bound to $O(n^2 + n\log m)$ membership queries in section 5.4.5. The number of counterexample queries remains unchanged. The algorithm Schapire uses differs in what it does after a counterexample query.

Sketch of the improvement

At the highest level, Schapire forces $(S,E,T)$ from Angluin's algorithm to have the extra condition that for a closed $(S,E,T)$ and each $s_1, s_2 \in S$ if $s_1 \neq s_2$ then $row(s_1) \neq row(s_2)$. This guarantees that $|S| \leq n$ and also makes the consistency property of Angluin's algorithm trivial to satisfy. To ensure this, he has to handle the results of a counterexample differently.

Given a counterexample $z$, Angluin simply added $z$ and all its prefixes to $S$. Schapire does something more subtle by instead adding a single element $e$ to $E$. This new $e$ will make $(S,E,T)$ to be not closed in Angluin's sense and the update to get closure with introduce at least one new string to $S$ while keeping all rows distinct. The condition on $e$ is:

$$\exists s, s' \in S, a \in \Sigma \quad \text{s.t} \quad row(s) = row(s'a) \; \text{and} \; o(\delta(q_0,se)) \neq o(\delta(q_0,s'ae))$$

Where $o$ is the output function, $q_0$ is the initial state, and $\delta$ the update rule of the true 'unknown' DFA. In otherwords, $e$ must serve as a witness to distinguish the future of $s$ from $s'a$.

To figure out this $e$ from $z$ we do a binary search to figure out a substring $r_i$ such that $z = p_ir_i$ and $0 \leq |p_i| = i < |z|$ such that the behavior of our conjectured-machine differs based on one input character. In more detail, we let $s_i$ be the string corresponding to the state reached in our conjectured machine by following $p_i$. We use binary search (this is where the $\log m$ comes from) to find an $k$ such that $o(\delta(q_0,s_kr_k)) \neq o(\delta(q_0,s_{k+1}r_{k+1})$. In other words, $r_{k+1}$ distinguishes two states that our conjectured machines finds equivalent and thus satisfies the condition on $e$, so we add it to $E$.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/118

MCA IGNOU GROUP

Are there improvements on Dana Angluin's algorithm for learning regular sets

References and Related Questions

Asked By : Artem Kaznatcheev

Answered By : Artem Kaznatcheev

Sketch of the improvement

Best Answer from StackOverflow

Akshay Mishra

0 comments:

Post a Comment

Search By Course Code

Sponsored Links

Blog Archive

Popular Posts

Label Cloud

adsbygoogle

Ad Blocker

Are there improvements on Dana Angluin's algorithm for learning regular sets

References and Related Questions

Asked By : Artem Kaznatcheev

Answered By : Artem Kaznatcheev

Sketch of the improvement

Best Answer from StackOverflow

Akshay Mishra

0 comments:

Post a Comment

Search By Course Code

Sponsored Links

Blog Archive

Popular Posts

Label Cloud

adsbygoogle

Ad Blocker

Click to Download Now