I'm working on a parser for a C-style language, and for that parser I need the regular expression that matches C-style /**/ comments. Now, I've found this expression on the web:
/\*([^\*]*\*+[^\*/])*([^\*]*\*+|[^\*]*\*/
However, this is a rather messy expression, and I have no idea whether it matches exactly what I want it to match.
Is there a different, rigorous way of defining regular expressions so that their correctness is easy to check by hand, and that can then be converted ('compiled') to a regular expression like the one above?
Asked By : Alex ten Brink
Answered By : Dave Clarke
I can think of four ways:
1. Define an automaton for the language you are interested in. Convert your regular expression to an automaton (for example, using Brzozowski's derivatives). Check that both automata accept the same language (determinize and minimize, or use a bisimulation argument).
2. Write loads of test cases and apply your regular expression to them.
3. Convert the automaton defined in point 1 to a regular expression, using standard techniques.
4. A combination of the above.
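
As a rough illustration of combining the first two approaches, here is a minimal Python sketch. It hand-writes a small DFA for "exactly one C-style comment" and compares it against a candidate regular expression on every string up to a bounded length over a tiny alphabet. The names `CANDIDATE`, `dfa_accepts` and `find_disagreements` are made up for this example, and the candidate pattern is a commonly quoted one, not the expression from the question. A bounded check like this is only evidence, not a proof; a proof would need the automaton equivalence check from point 1.

```python
import re
from itertools import product

# Candidate regex under test (an assumption for this sketch,
# not the expression quoted in the question).
CANDIDATE = re.compile(r"/\*[^*]*\*+([^/*][^*]*\*+)*/")


def dfa_accepts(s):
    """Hand-written DFA: accepts strings that are exactly one /* ... */ comment."""
    state = "start"
    for ch in s:
        if state == "start":
            state = "slash" if ch == "/" else "dead"
        elif state == "slash":
            state = "body" if ch == "*" else "dead"
        elif state == "body":
            # Inside the comment; a '*' might begin the closing "*/".
            state = "star" if ch == "*" else "body"
        elif state == "star":
            if ch == "/":
                state = "done"
            elif ch == "*":
                state = "star"
            else:
                state = "body"
        else:
            # "done" followed by more input, or already "dead".
            state = "dead"
    return state == "done"


def find_disagreements(pattern, alphabet="/*a", max_len=7):
    """Return all strings up to max_len on which the regex and the DFA differ."""
    bad = []
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if (pattern.fullmatch(s) is not None) != dfa_accepts(s):
                bad.append(s)
    return bad


if __name__ == "__main__":
    print(find_disagreements(CANDIDATE))
    # A naive pattern for comparison; it also accepts "/**/*/".
    print(find_disagreements(re.compile(r"/\*.*\*/"))[:5])
```

The three-character alphabet is chosen because `/`, `*` and "anything else" are the only character classes the DFA distinguishes, so short strings over it already exercise every transition.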
Question Source : http://cs.stackexchange.com/questions/311