I am reading parser chapter from dragon book and I found a grammar over there -
$ S \rightarrow L=R $
$ S \rightarrow R $
$L \rightarrow *R $
$L \rightarrow id $
$R \rightarrow L $
Now I have created some of its items, not sure if they are correct -
$ I_{0} $
$S' \rightarrow .S ; \$ $
$S \rightarrow .L=R ; \$ $
$S \rightarrow .R ; \$ $
$ L \rightarrow .*R; \$/= $
$ L \rightarrow .id; \$/= $
$ R \rightarrow .L; \$ $
$ I_{1} $
$ S' \rightarrow S.; \$ $
$ I_{2} $
$S \rightarrow L.=R ; \$ $
$ R \rightarrow L.; \$ $
Now I2 contains Shift reduce conflict but the lookahead symbol that is appearing is creating doubt I mean how $ can be lookahead symbol to shift =
sign?
Asked By : codeomnitrix
Answered By : Pseudonym
The grammar is LR(1) and LALR(1). It is not, however, SLR(1). The purpose of this example in the Dragon Book is to show a grammar which is LR(1) but not SLR(1).
The items that you've got so far are completely correct. To understand why state $I_2$ is deterministic, you have to understand what the parser has to do.
The parser has to choose between the following two items:
$S \rightarrow L \cdot = R; \{ \$ \} $
$R \rightarrow L \cdot ; \{ \$ \} $
The first one is a shift of a $=$ symbol which will transfer you to a state with the kernel item $S \rightarrow L = \cdot R; \{ \$ \}$. The second one is a reduce which reduces by the production $R \rightarrow L$.
The lookahead set is what the parser expects to see after it has read the whole production, but the dot indicates what the parser expects to see right now. In the case of the first item, it expects to see $=$, then $R$, then $\$$. When constructing the parser actions, the symbol that you're looking for to do the shift is $=$, because that is the symbol which is shifted.
If the second item's lookahead set contained the symbol $=$, then the shift-reduce conflict would not be resolvable and the grammar would not be LR(1), because it would not be clear which direction to take if the next symbol were $=$.
As it is, the state is deterministic. If the next symbol is $=$, shift it and take the first item. If the next symbol is $\$$, reduce.
(Warning: The rest of of this is a slight digression.)
As well as deriving the machine by hand, I fed the grammar into a parser generator (I used yacc, you could use Bison, ANTLR etc) just to be certain. If you do this, then you should be aware that production-quality parser generators sometimes take a shortcut by not consulting the lookahead symbol on reduction if they don't have to.
In this case, if you found yourself in state $I_2$ and the next symbol was $id$, then you may as well reduce, even though you could have detected a syntax error at this point.
There are three main reasons for this.
First, you don't have to check, since the error will eventually be detected when you try to shift the $id$ symbol. Remember, even the LR(0) machine is correct for this grammar, it's just nondeterministic; what we call a "conflict" in LR-speak we would call "nondeterminism" in other kinds of automata. If the decision is already deterministic, using lookahead information doesn't make it more deterministic.
Second, it's expensive to store and check the lookahead set. Even though computers are big these days, smaller is still better. Saving space is good.
Third, and perhaps most importantly these days, in real parsers, you need to report the syntax error to the user in a way that helps them understand what went wrong. The "bigger" the concept that you can use in the report (e.g. "expression" rather than "lvalue"), the more understandable it is. Reducing before reporting helps with this.
So if you feed this grammar into a production parser generator like ANTLR, Bison, or YACC, you will find that the actions for this state basically say "if you see $=$ then shift and go to state X, otherwise reduce". If you were wondering why they don't consult the lookahead set, that's why.
(Digression ends.)
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/20068
0 comments:
Post a Comment
Let us know your responses and feedback