[Solved]: Algorithm for length of longest common subsequence

posted by , on 11:16:00 PM , No Comments

Problem Detail:

The case of multiple strings. A slight modification of the dynamic programming algorithm for two strings is used as a subroutine. Here is the pseudo code:

# The modified dynamic programming algorithm for longest common subsequence. # Inputs: x, y - the strings           xL, yL - their respective length vectors lcs(x, xL, y, yL)     u = |x|     v = |y|     T = 2-dimensional table with default value 0 for all elements.     for i in 0..u-1         for j in 0..v-1             if x[i] == y[j]                 T[i, j] = T[i - 1, j - 1] + 1             else                 T[i, j] = max(T[i - 1, j], T[i, j - 1])             # limit the value of the table element by the length vectors:             T[i, j] = min(T[i, j], xL[i], yL[j])     # Take the element wise minimum of the vectors for each of the strings and the bottom row and right-most column respectively.     return x, element-wise-minimum-of(xL, T[_, v - 1]), y, element-wise-minimum-of(yL, T[u - 1, _])  # Algorithm to compute the length of the longest common subsequence of multiple strings. # Inputs: strings - a list of strings mlcs(strings):     if for any string x: |x| == 0         return 0     R = a dictionary indexed by a string x, where the value is initialized to be the length vector of x (initially [1,2,..,|x|]).     N = initially a copy of R.     while true         for i in 0..|strings|             for j in i + 1..|strings|                 x, y = strings[i], strings[j]                 _, xL, _, yL = lcs(x, R[x], y, R[y])                 R[x], R[y] = xL, yL         if for every string x: N[x] is element wise equal to R[x]             break         else             N[x] = R[x]     return the last element of any of the vectors in R

What follows is an implementation in the python language.

import collections  def lcs(x, xL, y, yL):     u = len(x)     v = len(y)     T = collections.defaultdict(int)     for i in range(u):         for j in range(v):             if x[i] == y[j]:                 T[i, j] = T[i - 1, j - 1] + 1             else:                 T[i, j] = max(T[i - 1, j], T[i, j - 1])             T[i, j] = min(T[i, j], xL[i], yL[j])     return (x, [min(xL[i], T[i, v - 1]) for i in range(u)]), (y, [min(yL[j], T[u - 1, j]) for j in range(v)])  def mlcs(strings):     if any(len(x) == 0 for x in strings):         return 0     R = {x: list(range(1, len(x) + 1)) for x in strings}     N = {x: R[x] for x in strings}     while True:         for i in range(len(strings)):             for j in range(i + 1, len(strings)):                 x, y = strings[i], strings[j]                 (_, xL), (_, yL) = lcs(x, R[x], y, R[y])                 R[x] = xL                 R[y] = yL         if all(all(N[x][i] == R[x][i] for i in range(len(x))) for x in strings):             break         else:             for x in strings:                 N[x] = R[x][:]     return R[strings[-1]][-1]

The idea is to initialize for each input string a vector which holds information for each position of that string about what is the maximum length of a LCS up to that position. These are updated each time a pair-wise LCS computation is done. This in turn is done for all pairs until all the vectors are stable (ie. don't change) between iterations.

So far testing gives me mixed results. The correct length at first seems to be produced pretty reliably, but when trying to use the function mlcs as an oracle in an algorithm for retrieving an actual longest sequence, sometimes the result is not what is expected. This may or may not indicate that there are cases for which an incorrect length is produced.

So on to the question: Is the idea of the length computing algorithm sound?

Asked By : user72935

Answered By : user72935

The algorithm is incorrect. That is it may return a length that is greater than the actual length of a longest common subsequence. Here is a test case which fails to produce the correct length: ['baaa', 'aaba', 'baba'].

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/60584

3.2K people like this

Download Related Notes/Documents

MCA IGNOU GROUP

[Solved]: Algorithm for length of longest common subsequence

Asked By : user72935

Answered By : user72935

Best Answer from StackOverflow

Akshay Mishra

0 comments:

Post a Comment

Search By Course Code

Sponsored Links

Blog Archive

Popular Posts

Label Cloud

adsbygoogle

Ad Blocker

[Solved]: Algorithm for length of longest common subsequence

Asked By : user72935

Answered By : user72935

Best Answer from StackOverflow

Akshay Mishra

0 comments:

Post a Comment

Search By Course Code

Sponsored Links

Blog Archive

Popular Posts

Label Cloud

adsbygoogle

Ad Blocker

Click to Download Now