World's most popular travel blog for travel bloggers.

[Solved]: More efficient algorithm for determining if one list is a sublist of another list

, , No Comments
Problem Detail: 

I'm trying to build an algorithm which takes two lists of natural numbers and finds if every element of the first list is displayed at least once in the second list.

What if the list is sorted?

An algorithm that can do this is by comparing every element of the first list with every element from the second list. I think there is an algorithm with a better complexity. Can anyone give me any idea?

Asked By : Panarit

Answered By : Gilles

Let's start with the case when the lists are sorted. In that case, you can apply a simple modification of the basic merge algorithm on sorted lists: discard the elements instead of constructing a merged list, and only keep track of whether an element from list 1 was missing from list 2.

In the pseudo-code below, head(list) is the first element of list, and advance(list) means to discard the head of list. In other words, head(cons(h,t)) = h and advance(list) when list = cons(h,t) means list := t.

while list1 is not empty:     if list2 is empty or head(list1) < head(list2):         return false     else if head(list1) = head(list2):         let x = head(list1)         while list1 is not empty and head(list1) == x: advance(list1)         while list2 is not empty and head(list2) == x: advance(list2)     else: (head(list1) > head(list2))         advance(list2) return true 

Exercise: prove that this algorithm returns true when all the elements of list1 occur in list2 and false otherwise.

Let $n_1 = \mathrm{length}(\mathtt{list1})$, $n_2 = \mathrm{length}(\mathtt{list2})$ and $n = \max(n_1, n_2)$. The algorithm above removes at least one element of list2 at each iteration, and in the worst case it removes all the elements of both list. Therefore it executes at most $n_2$ iterations of the loop and performs at most $n_1 + n_2$ removals of the head of the list: its complexity is $O(n)$.

Now suppose that the lists are not sorted. An obvious solution is to sort them, then apply the known algorithm. Since sorting is $O(n \log n)$, this algorithm is $O(n \log n)$.

Exercise: make sure you understand why my statements about complexity are true.

Exercise (difficult!): can you do better than $O(n \log n)$ in the general case?

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/18346

0 comments:

Post a Comment

Let us know your responses and feedback