**Problem Detail:**

Are there any measures to compute similarity (or distance) between two DFAs?

If yes, which are the main references?

I need a measure of similarity, not only a (binary) equivalence test.

"Similarity" is an intuitive concept. There are different ways to formalize it; i'm looking for ready-to-use formalizations with regard to DFAs (not NFAs, weighted automata, or general graphs!). Furthermore, good solutions may exploit recognized languages.

I found some studies about this problem, which help to clarify the question. They involve two approaches: one from model-testing field aiming at compare languages, a second defined by authors aiming at compare structure of DFAs.

Bogdanov, Kirill, and Neil Walkinshaw. "

" Reverse Engineering, 2009. WCRE'09. 16th Working Conference on. IEEE, 2009.*Computing the structural difference between state-based models.*Walkinshaw, Neil, Kirill Bogdanov, and Ken Johnson. "

" Grammatical Inference: Algorithms and Applications. Springer Berlin Heidelberg, 2008. 252-265.*Evaluation and comparison of inferred regular grammars.*Walkinshaw, Neil, and Kirill Bogdanov. "

." ACM Transactions on Software Engineering and Methodology (TOSEM) 22.2 (2013): 13.*Automated comparison of state-based software models in terms of their language and structure*

They compute exactly what I was looking for, a metrics among DFAs, adopting the techniques developed in the model-testing field.

Does there exist any other approaches, ready-to-use, to compute similarity between DFAs? (Both in terms of language both of structure).

###### Asked By : Gabrer

#### Answered By : DCTLib

This problem seems to ask for co-development of a reasonable *distance function* and an algorithm to compute it for a given pair of languages.

The paper "**The Cost of Traveling between Languages**" by *Michael Benedikt, Gabriele Puppis, and Cristian Riveros* defines one such notion, where we are searching for the largest value $d$ such that there exists a word $w$ in the language of the first automaton such that the edit distance of $w$ to any word in the second language is at least $|w| \cdot d$. The paper gives an algorithm to solve this problem, which involves distance automata.

The paper also has an interesting related work section, with a reference to the paper "**Edit-distance of weighted automata: general definitions and algorithms**" by *Mohri*, which seems to solve the same problem without the multiplication with the word length.

Both of these definitions are reasonable from a theoretical point of view - you now only have to evaluate whether they make sense for your particular application.

Both of the paper have author-archived versions available without pay-wall on the web which you should easily find with your favorite search engine.

###### Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/56211

**3.2K people like this**

## 0 comments:

## Post a Comment

Let us know your responses and feedback