World's most popular travel blog for travel bloggers.

Finding the grammar type of the programming language

, , No Comments
Problem Detail: 

How can someone find what type of grammar for a given programming language?

Formerly I'm looking for a grammar type for most popular programming languages: C, C++, C#, Java, List, OCaml, Haskell etc.

Asked By : m0nhawk

Answered By : Gilles

Programming languages are typically defined in several stages. A language in formal language theory is a set of strings, but that isn't a very useful concept to explain programming languages. The meaning of a source program is defined by several stages of translation; here are typical stages in the compilation of a translation unit:

  1. Lexical analysis, i.e. splitting the program into tokens. Token recognition is often performed by finite automata, but not always (e.g. languages with nested comments need a stack at this stage).
  2. Parsing. This stage is usually described by a context-free grammar. Although lexical analysis could be described by that same context-free grammar, this would typically result in an extremely complex and slow-to-parse grammar, which is why the two stages are separate.
  3. Variable binding, i.e. associating the different uses of the same name to mean the same variable. In particular, in most languages, variables need to be declared before use. This cannot be described by a context-free grammar.
  4. Typechecking, if the language has a static type system. There is a lot of variability here.
  5. After that point, subsequent compilation stages tend not to have failure cases. There are exceptions, for example some languages regulate recursive definitions at a late stage.

There's no grammar type that describes valid programs in typical programming languages (unless you go for a formalism that can describe all decidable languages or more, such as unrestricted grammars). Many languages can be described by a context-free grammar up to stage 2, but there are exceptions (for example, C has feedback from variable binding to parsing because something like f(); foo * bar; is valid in C89 if foo is a variable but not if foo is a type defined by typedef). When you reach the binding stage, there has been work on general models for binding, but I'm not aware of any name for the kind of languages and transducers that correspond to context-free-grammars-plus-bindings. For typechecking, the sky's the limit: there are languages with Turing-complete type systems, and even the ones that aren't are typically very ad hoc and don't particularly fit any interesting category other than decidable.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/7566

0 comments:

Post a Comment

Let us know your responses and feedback