[Solved]: Can POSIX BRE express all regular languages?

posted by , on 1:48:00 AM , No Comments

Problem Detail:

It appears that "Basic Regular Expressions" as defined by POSIX.1-2008 do not support alternation, a|b (although some grep implementations recognize the escaped version, \|).

Since the regular languages are closed under union by definition, does this mean that POSIX BRE has less expressive power than a finite automaton? Or is there some way to simulate alternation using other constructs?

Asked By : Steve Kobes

Answered By : Gilles

Indeed the POSIX BRE language cannot express all regular expressions because it lacks alternation. It can't even recognize all finite languages, let alone all regular languages.

For example, $\{ab, ba\}$ is not recognizable as a BRE. To prove this, consider what the toplevel syntactic form could be:

It can't be one of the single-character forms since the language has words of length $\gt 1$.
It can't be $R^*$ because that would match the empty string.
It can't be $R^{\{m,n\}}$ except with $m=n=1$ (in which case we're back to the original problem) because that would match strings of different lengths or the empty string.
So it has to be concatenation: $R_1 R_2$. Now consider how $ab$ is recognized:
- If $R_1$ recognizes $ab$ then $R_2$ must not recognize anything anything other than the empty string. So $R_1$ must recognize $\{ab,ba\}$ and we're back to the original problem.
- If $R_1$ recognizes $a$ but not $ab$ then $R_2$ must recognize $b$. But then $R_1R_2$ recognizes all words of the form $u b$ where $R_1$ recognizes $u$, so $R_1$ must not recognize anything other than $a$. There's no way to recognize $ba$.
- If $R_1$ recognizes neither $ab$ nor $a$ then the only way for $R$ to recognize $ab$ is if $R_1$ recognizes the empty string, in which case we're back to the original problem as above, but for $R_2$ this time.

When "we're back to the original problem", that means that the only solution to find a BRE recognizes the language is to find a smaller BRE that has the same property. This is an infinite descent, so there's no BRE that has the desired property.

I don't think there's a "nice" characterization of BRE-recognizable languages, for example as languages recognizable by a "nice" class of automata.

Note that BRE-recognizable languages are actually not a subclass of regular languages, since backreferences add expressive power. For example $\{w w \mid w \in \{a,b\}^*\}$ is recognized by the BRE $.*$\1 but is famously not regular. BRE without backreferences are just syntactic sugar over regular expressions so the languages they can recognize are a subclass of the regular languages.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/62944

MCA IGNOU GROUP

[Solved]: Can POSIX BRE express all regular languages?

Asked By : Steve Kobes

Answered By : Gilles

Best Answer from StackOverflow

Akshay Mishra

0 comments:

Post a Comment

Search By Course Code

Blog Archive

Popular Posts

Label Cloud

Ad Blocker

[Solved]: Can POSIX BRE express all regular languages?

Asked By : Steve Kobes

Answered By : Gilles

Best Answer from StackOverflow

Akshay Mishra

0 comments:

Post a Comment

Search By Course Code

Blog Archive

Popular Posts

Label Cloud

Ad Blocker

Click to Download Now