World's most popular travel blog for travel bloggers.

[Solved]: Radix sort exercise

, , No Comments
Problem Detail: 

I'm trying to understand a proof regarding radix-sort but to no avail.

I'll first write down a brief summary of the proof and then assign some questions which I hope will be clarified enough.

Suppose you have an array of integer numbers in the range $\{0,1,2,\ldots n \log n\}$. Show and explain an effective algorithm in the worst case which finds out if there are two elements with the same value. note - You can use an extra memory in order of magnitude equals to $O(n)$.

I don't really care about how the algorithm will look like. The idea to this proof is using radix-sort in $O(n)$ and then looking for two elements with the same value as the array is sorted.

The proof outline:

Let's suppose we are examining a larger domain which is $\{0,1,2 ,\ldots n^2 - 1\}$. Now we'll treat each number according to its binary representation using radix-sort bit by bit.

Right after this, comes the part I can't understand at all....

As we know the order of magnitude of radix-sort is $\Theta(d(n+k))$ and therefore all we have to decide is which a to choose to have order of magnitude equals to $O(n)$.

using the formula $\frac{2\log n}{a} (n + 2^a)$.

After this step, you just choose $a = \log n$ and you are done.

My questions are:

  1. How did the writer concluded the following formula: $\frac{2\log n}{a} (n + 2^a)$?

  2. Also, why do the writer prefer to work on the domain of $\{0,1,2 ,\ldots n^2 - 1\}$ instead the one given in the question?

Asked By : SyndicatorBBB

Answered By : William Macrae

  1. In the formula $\Theta(d(n+k))$, $d$ is the number of digits (iterations in the sort) and $k$ is the size of the buckets (the base you are working in). So, if you were sorting the numbers 0 through 999 with 10 buckets, there would be $d=3$ iterations (since the numbers have length 3 in base 10) and we would have $k=10$, which is the base.

    If we leave the base, $2^a$ to be determined later (I guess by using a power of two we can read a bits and let that determine the bucket), there are $\log_{2^a}(n^2) = \frac{\log_2 n^2}{\log_2(2^a)} = \frac{2 \log n}{a}$ iterations. $k= 2^a$ is just the base. So $d(n+k) = \frac{2 \log n}{a}(n + 2^a)$.

  2. The expanded range is to make the math prettier. Above we took $\log n^2$, which worked out nicely to give a constant factor 2 in our assymptotic. If we instead had $\log (n \log n)$, things would not be as nice. It should still work (possibly for a different choice of $a$), and if you're comfortable with big-O work you can do it as an exercise, but the proof is much cleaner when you don't have to argue that $O(\log n + \log \log n) = O(\log n)$.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/7051

0 comments:

Post a Comment

Let us know your responses and feedback