I have few CS textbooks with me which discuss languages, well actually 2 plus old course notes supplied a few years ago. I have been searching the web too any only seem to come up with vague responses just like the text books I have.
My question is about language recognisers verses generators.
I get the overriding principle of a recogniser. That it analyses a language and is able to determine nay or yay if a String belongs to a language. This is at least what I have picked up from the books and notes. However, it's much more complex than that is it not? A tokenise and syntax analyser (which I assume to be recognisers) do not just say yes or no, they also ways where don't they...?
However, language generators. No one seems to be very clear about what they are. The typical description I get is For example Sebasta's Concepts of programming languages says 'A language generator is a device that can be used to generate the sentences of a language. We can think of a generator as a push button that produces a sentence of a language every time it is pressed.' Seriously? That's it?? Your kidding right....
I read that Regex is an example of a Generator, then why when people talk of generators to they not talk of the inputs. For example Regex has a target String, and the Regex with defines both the accepted alphabet and it's grammar rules.
Can someone provide for me a clearer distinction of what a recogniser is?
Asked By : Andrew S
Answered By : babou
I assume that, by definition, a language is a set of finite strings.
The first distinction to be made is between operational and denotational concepts.
Denotational definition of a language is a precise specification of the language through various (a priori) non computational mathematical devices such as, for example:
a specification with a property of strings that characterise the strings of the language. $\{ x^n \mid n$ is a prime integer$\}$
an algebraic characterisation from other languages and strings: $L= (L_1 \cup L_2).abac$ where $L_1$ and $L_2$ are languages already defined. Note that $abac$ may also be viewed as a language with a single string. Regular expressions are an example of such a definition, when they are interpreted as defining regular languages. (Discussion of their pragmatic use for string matching in programming languages, under the name of Regex or Regexp, would require a specific presentation).
a system of language equations, of which the relevant language is a solution with specific properties, for example the smallest solution, if that is well defined. An example of such an equation is $A = A B a$ where $a$ is a symbol, and $A, B$ are variables ranging on languages over an alphabet containing $a$. A pair of languages $(\hat A,\hat B)$ is a solution if $\hat A$ is equal to the concatenation of itself with $\hat B$ and a final symbol $a$
any mix of the above
Then operational concepts give you ways of answering algorithmic questions about the language or performing algorithmic tasks related to the language.
Among these operational concepts, two are classical ones:
if you are given a string, can you decide whether it is in the language or not. Such an algorithm providing a decision procedure is a recognizer.
can you enumerate all the strings in the language. If such an algorithm exists, the language is said to be recursively enumerable. There are many other names. Such an enumeration algorithm is a generator of the language.
A generator provides a semi-decision procedure that can tell you whether a string is in the language, but may not terminate if it is not in the language.
A recogniser can always be used to have a generator (assuming your alphabet is denumerable, which is a general assumption in operational definitions). You just enumerate all the strings on the alphabet, and use the recognizer to determine whether it is in the language.
Both a generator or a recognizer may be used to define a language, operationally.
It is often the case that several "views" or "definitions" of the same language have to be used. It is then essential to prove consistency of the different views, to establish that they do define the same language. Very often, general theorems will provide that. For example, there are theorems that establish that some Context-Free recognizer building techniques (e.g., LR(k)) are consistent with the generator view of the CF grammars.
The same formalism may sometimes be read as fitting any of the above concepts.
For example, a context-free grammar define the language as a the smallest solution to a system of language equations.
It may also be read as a string rewriting system that can generate the language.
And it can be used in a fairly direct way to decide whether a string belongs to the language (without building any specific pushdown machine).
Note that there are many other thing one may want to do with a language, such as associating a structure with strings (e.g., parse-tree, or derivation tree, which need not be the same, depending on the kind of grammar considered). One may also want to associate semantics with the strings. But that is yet another type of problems, which may be very dependent of the kind of language considered.
Note following a remark by user @Vor. It seems that the concept of recognition, and probably of recognizer, is not the same in various sub-communities of the field.
Two articles of Wikipedia seem to have differing views on this:
-
a grammar is usually thought of as a language generator. However, it can also sometimes be used as the basis for a "recognizer"—a function in computing that determines whether a given string belongs to the language or is grammatically incorrect.
Recursively enumerable language
In mathematics, logic and computer science, a formal language is called recursively enumerable (also recognizable, partially decidable, semidecidable or Turing-acceptable) if it is a recursively enumerable subset in the set of all possible words over the alphabet of the language, i.e., if there exists a Turing machine which will enumerate all valid strings of the language.
It is not exactly the same use of the concept of recognition, applied in one case to a string and in the other to the language. Nevertheless, it seems that the terminology is somewhat inconsistent.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/22288
0 comments:
Post a Comment
Let us know your responses and feedback