World's most popular travel blog for travel bloggers.

[Solved]: Big-O Notation of Anagram solution algorithm

, , No Comments
Problem Detail: 

In Solution 1: Checking Off of Problem Solving with Algorithms and Data Structures, just beneath the ActiveCode: 1 extract (included at the bottom of this post for reference), it is stated:

each of the n characters in s1 will cause an iteration through up to n characters in the list from s2.

...which to paraphrase (less succinctly), means "each character in s1 could result in all characters in s2 being inspected".

The following sentence states:

Each of the n positions in the list will be visited once to match a character from s1.

Firstly I'm unsure which list is being referred, I think it means the variable alist, i.e. s2. Is this correct?

The number of visits then becomes the sum of the integers from 1 to n.

Secondly (and the real point of this post), if each item in the list (variable alist) is inspected for each character in s1, then why is this not simply O(n) = n^2 as per "nested loops"? If n=5, then then sum(n) = 15, where as n^2 = 25.

I'm fine with simplifying the equation for summing 1 to n integers to get n^2, but I'm confused about how sum(5) = 15, whereas O(n) = n^2.

enter image description here

ActiveCode: 1 Checking Off (active5)

def anagramSolution1(s1,s2):     alist = list(s2)      pos1 = 0     stillOK = True      while pos1 < len(s1) and stillOK:         pos2 = 0         found = False         while pos2 < len(alist) and not found:             if s1[pos1] == alist[pos2]:                 found = True             else:                 pos2 = pos2 + 1          if found:             alist[pos2] = None         else:             stillOK = False          pos1 = pos1 + 1      return stillOK  print(anagramSolution1('abcd','dcba') 

Edit to add

Having written this out it struck me that perhaps dropping the (1/2) and (1/2)n from the formula to sum(n) is the reason for the difference of actually summing n, as opposed to just doing n^2.

To be absolutely clear, I understand the formula for summing n is n^2 when dropping the non-dominant terms. What I don't understand is why we're summing n and not immediately and directly stating it is actually n^2 as a result of the nested loop and this sentence:

each of the n characters in s1 will cause an iteration through up to n characters in the list from s2.

enter image description here

Asked By : Jack

Answered By : unutbu

Let's be concrete. Suppose s1 = 'abcd' and s2 = 'dcba'.

For the a in s1, it will take 4 iterations to find the a in s2. For the b in s1, it will take 3 iterations to find the b in s2. For the c in s1, it will take 2 iterations to find the c in s2. For the d in s1, it will take 1 iteration to find the d in s2.

So in this case, it does take sum(i) for i = 1,...,n iterations.

Now some quiet meditation should convince you than if s2 is an anagram of s1, it will again take exactly the same number of iterations; the summands will just appear in a different order. (After all, whatever is the first character in s2 will only require 1 iteration to be found, the second character requires 2 iterations, etc.)

If s2 is not an anagram of s1, then the algorithm will short-circuit when the first character not in s2 is tested. So then the algorithm will take less than O(n^2) time. But in the worst case, the algorithm takes O(n^2) time.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/33618

0 comments:

Post a Comment

Let us know your responses and feedback