I am new to automata, and I have been given a brief introduction to regular expressions only yesterday. I have read the various rules which to define a regular expression. But I am unable to differentiate between regular expressions and grammar of a language( I have not been taught the grammar for regular expressions).
I understand that grammar helps us to generate the valid strings in a language, but then this is what the rules for defining a regular expressions state. So where does the difference lie? I asked my teacher and he said that regex are the most basic strings in a language and grammar is the set of rules forany language, which are of higher order than regex. Can someone provide some more in-depth information?
Asked By : frank
Answered By : Luke Mathieson
Regular expressions, regular grammars and finite automata are simply three different formalisms for the same thing. There are algorithms to convert from any of them to any other.
The basic reason that we have all three is that they were created independently, with the first set of equivalences (there are several other formalisms as well) proven by Kleene (this result, or part thereof is called Kleene's Theorem).
So in that context, depending on which way round you want to run the models, they all recognise or generate strings of a regular language, and mathematically, there is, in that sense, no difference.
Of course sometimes one model is easier to use than another for a particular task, due to the details of the formalism. Furthermore the way they work in a human's head is often a little different, finite automata "feel" like computers, regular expressions "feel" like you're constructing a string out of smaller substrings and regular grammars "feel" like a more traditional grammatical derivation or classification of a sentence in a language (unsurprisingly when you look at the history).
So to compare the two, let's define them:
Regular Expressions
So regular expressions are recursively defined as follows:
- $\emptyset$ is a regular expression
- $\varepsilon$ is a regular expression
- $a$ is a regular expression for every $a \in \Sigma$
- if $A$ and $B$ are regular expressions then
- $A\cdot B$ is a regular expression (concatentation)
- $A \mid B$ is a regular expression (alternation)
- $A^{\ast}$ is a regular expression (Kleene star)
Along with some semantics (i.e. how we interpret the operators to get a string), we get a way of generating strings from a regular language.
Regular Grammars
Regular grammars consist of a four tuple $(N,\Sigma, P, S \in N)$ where $N$ is the set of non-terminals, $\Sigma$ is the set of terminals, $S$ is the start non-terminal and $P$ is the set of productions which tell us how to change the start symbol, step by step, into a string in $\Sigma^{\ast}$. $P$ can have its productions drawn from one of two types (not both though):
Right Linear Grammars
For non-terminals $B$, $C$, terminal $a$ and the empty string $\varepsilon$, all rules are of the form:
- $B \rightarrow a$
- $B \rightarrow aC$
- $B \rightarrow \varepsilon$
Left Linear Grammars
Left linear grammars are the same, but rule #2 is $B \rightarrow Ca$.
Things to Ponder
So looking at these definitions and playing with them, we can see that regular expressions look like matching rules, or ways of dealing with strings a bit at a time.
Grammars seem to "label" sections of the string, and group labels under new labels to validate the string (i.e. if we can get from $S$ to the string, or vice versa, we're happy).
However these are really doing the same fundamental thing, and how you view the metaphor of their function is really up to you.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/45755
0 comments:
Post a Comment
Let us know your responses and feedback