World's most popular travel blog for travel bloggers.

Counting sort on non-integers - why not possible?

, , No Comments
Problem Detail: 

What is it about the structure of counting sort that only makes it work on integers?

Surely strings can be counted?

''' allocate an array Count[0..k] ; initialize each array cell to zero ; THEN ''' for each input item x:     Count[key(x)] = Count[key(x)] + 1 total = 0 for i = 0, 1, ... k:     c = Count[i]     Count[i] = total     total = total + c  ''' allocate an output array Output[0..n-1] ; THEN ''' for each input item x:     store x in Output[Count[key(x)]]     Count[key(x)] = Count[key(x)] + 1 return Output 

Where above does the linear time break down if you try to use counting sort on strings instead (assuming you have strings of fixed length)?

Asked By : The Unfun Cat

Answered By : jmad

If you naively apply the algorithm you need an encoding $f$ of strings of length $m$ into integers such that $s_1≤s_2$ iff $f(s_1)≤f(s_2)$. The bound on these integers is necessarily at least $c^m$ where $c$ is the number of possible characters.

You then have a complexity in $O(nc^m)$ where $n$ is the number of elements to be sorted. This could be interesting when $c^m$ is not too big compared to the usual bound $\log n$, which is possible, in very peculiar cases.

In general, one would prefer the bound $O(n\log n)$ which does not depend on the length of your strings and in general much faster than $O(nc^m)$. To give a formal comparison between them, the former would be $O(s\log s)$ and the latter $O(sc^s)$ where $s$ is the size of the input.

However what is possible if you know that you will have a number of different strings no greater than some $k$, and that there will be a lot of occurrences of the same strings, is to build a correspondence between these strings and $\{1,\dots,k\}$ on which you will apply the counting sort, thus obtaining a complexity of $O(kn)$ i.e. linear (since $k$ is constant). But in this case the counting sort is not the only one reaching a linear complexity, for example all algorithms using balanced trees can be adapted to be linear, even insertion sort which is quadratic in the worst case, becomes linear.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/6641

0 comments:

Post a Comment

Let us know your responses and feedback