I am having trouble getting my head wrapped around epsilon transitions while creating an LALR(1) parse table.
Here's a grammar that recognizes any number of 'a' followed by a 'b'. 'S' is an artificial start state. '$' is an artificial 'EOS' token.
0. S -> A $ 1. A -> B b 2. B -> B a 3. B -> epsilon
Itemsets:
i0: S -> . A $ A -> .B b B -> .B a A -> B . b ! because B could -> epsilon B -> B . a ! " i1: S -> A . $ i2: S -> A $ . i3: A -> B . b ! from i0 B -> B . a i4: A -> B b . ! from i0 or i3; the LALR algorithm compresses identical states. i5: B -> B a . ! from i0 or i3: the LALR algorithm compresses identical states.
I previously had a description on how this would work to parse a simple string. I removed it because I know less now than I did before. I can't even figure out a parse tree for 'ab'.
If someone could show me how I have mis-constructed my itemsets and how I'd reduce the epsilon transition I'd be grateful.
Asked By : Tony Ennis
Answered By : rici
Your states and itemsets are not quite correct. The epsilon production must appear in relevant itemsets, and you have combined two states into one, which would produce a shift-reduce conflict if the epsilon production were added to the itemset (which should be done).
The following was generated with bison (using the --report=all
command-line option); it differs from the theoretic model because the grammar has been "augmented" with an extra start symbol and an explicit end-of-input marker ($end
). Also, it has done some table compression, so in the action tables, you can think of $default
as meaning "either a
or b
".
It is worth explaining how State 0 comes about, since it shows how epsilon productions are handled (no differently from other productions).
We start with $accept: . S $end
, by definition. ($accept
is the starting state). Then the closure rule is applied as long as possible. Remember that the closure rule is: If any item in the itemset, the .
is immediately before a non-terminal, add all the productions for that non-terminal with an initial .
. Hence we add:
S: . A
continuing with A
:
A: . B 'b'
continuing with B
:
B: . B 'a' B: .
We can't apply closure any longer, so we're done. Since the state now has an item with the dot at the end (the epsilon production for B), a reduction is possible.
State 0 0 $accept: . S $end 1 S: . A 2 A: . B 'b' 3 B: . B 'a' 4 | . $default reduce using rule 4 (B) S go to state 1 A go to state 2 B go to state 3 State 1 0 $accept: S . $end $end shift, and go to state 4 State 2 1 S: A . $default reduce using rule 1 (S) State 3 2 A: B . 'b' 3 B: B . 'a' 'b' shift, and go to state 5 'a' shift, and go to state 6 State 4 0 $accept: S $end . $default accept State 5 2 A: B 'b' . $default reduce using rule 2 (A) State 6 3 B: B 'a' . $default reduce using rule 3 (B)
In State 0, the closure rule has added the epsilon production (line 4). Furthermore, no item in the state 0 itemset has the point before a terminal. So with any lookahead, the parser is forced to reduce the epsilon production, after which it will use the goto
function for state 0 to decide to move to state 3. (In your state machine, states 0 and 3 are conflated, but I do not believe this is correct.) State 3 will definitely shift a terminal; with the input ab$end
, it will shift the a
and move to state 6, which will then reduce a B
. And so on.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/33999
0 comments:
Post a Comment
Let us know your responses and feedback