One way of looking at regular expressions is as a constructive proof of the following fact: it's possible to construct the regular languages by starting with a small set of languages and combining them via a small, fixed set of closure properties. Specifically, if we start with the empty language, the language containing the empty string, and the languages of all single-character strings, we can assemble all possible regular languages using union, concatenation, and Kleene star.
Is there a set of base languages and closure properties that can be used to generate all and only the context-free languages? (To clarify: I'm not asking whether you can write regular expressions for all CFLs, which I know is impossible. Instead, I'm wondering whether there is a way to design a regular-expression-like framework for CFLs based on the same basic principles.)
Asked By : templatetypedef
Answered By : Hendrik Jan
That is possible, but perhaps not exactly in the way you ask. As a start take the Dyck language $D_2$ of all strings of matching brackets over two pairs of brackets, say $\{ [,], (,) \}$, or more abstractly $a_1,b_1,a_2,b_2$.
Then every context-free language can be obtained from $D_2$ using homomorphisms, inverse homomorphisms and intersection with regular languages. The context-free languages are called "the principal cone generated by $D_2$", in old books denoted $\mathcal{M}(D_2)$. See a related question: "Which languages are recognized by one-counter machines?"
In fact, we need only one of each operations (well-chosen). Each CFL can be written $g(h^{-1}(D_2) \cap R)$, where $g$, $h$ are homomorphisms, and $R$ is a regular language. Intuitively, $R$ is a program of a PDA, $g$ maps each instruction to the letter read, $h$ to the push and pop operations of the stack. Finally $D_2$ codes proper stack behaviour.
This result is related to the Chomsky–Schützenberger Theorem (or as can be seen in wikipedia, one of them). The statement linked here in wikipedia (a) does not need an inverse homomorphism, while (b) it does not restrict to two pairs of brackets. Theorems of this type are from the area of "Abstract Families of Automata", where Ginsburg and Greibach are important names. A related result by Nivat states that operations of the form $L \mapsto g(h^{-1}(L) \cap R)$ for fixed $g,h,R$ are the finite state transductions.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/30026
0 comments:
Post a Comment
Let us know your responses and feedback