World's most popular travel blog for travel bloggers.

How to define at least one occurrence of a string between two tokens in bottom up LALR(1) parser grammar

, , No Comments
Problem Detail: 

I am trying to define a non terminal symbol in a LALR(1) grammar (with CUP parser). It is requested that

the <code> token must appear exactly twice,  while the <hour> token must appear at least once. 

In the end I came up with this definition:

section     ::= hour_l CODE SC hour_l CODE SC hour_l ; hour_l      ::= /* epsilon */              | hour_l HOUR SC ; 

where SC is a separator (semicolon) between tokens and hour_l is the non terminal symbol for hour's list. This solution has a problem: HOUR can be absent, because epsilon can be reduced from hour_l. Is there a clever solution other than specifying all possibilities or using the semantic capabilities of CUP (ie. putting a counter of how many times HOUR is present in section)? I'd prefer not to use semantics in order to achieve this; in fact, it seems to me this is syntax related.


Asked By : fbrundu

Answered By : fbrundu

My solution, suggested by a friend, is to use a Finite State Machine. I drew a Deterministic Finite Automata, and $C$ is the final state accepted by this machine:


I then transformed it into a right regular grammar:

section     ::= c ; a           ::= CODE SC ; b           ::= a CODE SC ; c           ::= c HOUR SC | b HOUR SC | e CODE SC ; d           ::= HOUR SC | d HOUR SC ; e           ::= e HOUR SC | a HOUR SC | d CODE SC ; 

Hope it helps.

Best Answer from StackOverflow

Question Source :

3200 people like this

 Download Related Notes/Documents


Post a Comment

Let us know your responses and feedback