I want to calculate the edit distance (aka Levenshtein-Distance) between two words: «solo» and «oslo».
According to this site we'll get the result matrix:
What I don't understand is: In case of comparison the last «o» from «solo» with the first «o» of «oslo» will see the submatrix:
3 2
4 3
As far as I understand, in order to calculate the bottom right value, which is equal in this example to 3 of this submatrix, we'll get the min(3, 2, 4)
+ (0 if the letters are the same, otherwise 1). So, why in this example the bottom right value is equal to 3 and not to 2?
Asked By : Mike B.
Answered By : Mike B.
If we look deeply at algorithm implementation in C: http://en.m.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#C
we'll see the main string which fills the matrix:
matrix[x][y] = MIN3(matrix[x-1][y] + 1, matrix[x][y-1] + 1, matrix[x-1][y-1] + (s1[y-1] == s2[x-1] ? 0 : 1));
as we can see, the value of cells with indexes matrix[x-1][y]
(upper cell) and matrix[x][y-1]
(left cell) in automatically increased, the upper-left cell (by diagonal) with index matrix[x-1][y-1]
is increased by 1 only if the letters are different and only after that we'll take a minimum from these 3 values. In other words:
3 2
4 min(3+0, 2+1, 4+1) = 3
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/28734
0 comments:
Post a Comment
Let us know your responses and feedback