Has some sort of canonical - or reference - implementation of Dual-pivot Quicksort been posted anywhere?
I would like to include that algorithm in a comparison among sorting algorithms for a specialized need that I have, but the Java versions I've seen appear to have various kinds of tweaks applied to them, like using Insertion Sort for small (sub-) arrays, which makes it harder to compare the fundamentals.
Asked By : 500 - Internal Server Error
Answered By : Sebastian
I tried to do exactly such a comparison in my master thesis, which thus naturally includes pseudo-code of "basic" versions of several dual-pivot Quicksorts (there is a list of them on page 9).
Here is my basic implementation of Yaroslavskiy's algorithm (the dual-pivot scheme that is used in Java 7):
void sort(int[] A, int left, int right) { if (right > left) { // Choose outermost elements as pivots if (A[left] > A[right]) swap(A, left, right); int p = A[left], q = A[right]; // Partition A according to invariant below int l = left + 1, g = right - 1, k = l; while (k <= g) { if (A[k] < p) { swap(A, k, l); ++l; } else if (A[k] >= q) { while (A[g] > q && k < g) --g; swap(A, k, g); --g; if (A[k] < p) { swap(A, k, l); ++l; } } ++k; } --l; ++g; // Swap pivots to final place swap(A, left, l); swap(A, right, g); // Recursively sort partitions sort(A, left, l - 1); sort(A, l + 1, g - 1); sort(A, g + 1, right); } } void swap(int[] A, int i, int j) { final int tmp = A[i]; A[i] = A[j]; A[j] = tmp; }
The partitioning loop maintains the following invariant: (Initially, the "?"-area is the whole array, at the end it is gone; the indices are moved in the direction indicated by the arrays, so $\ell$ and $k$ start at the left, $g$ at the right end of $A$.)
I did a mathematical average case analysis for these algorithms to determine the expected number of key comparisons, element swaps and also instruction counts for some specific low-level implementations (in Java Bytecode and MMIX, an RISC assembly language).
The main result is that Yaroslavskiy's algorithm uses $1.9n \ln n + O(n)$ comparison, which is 5% less than the $2n \ln n + O(n)$ comparisons of classic single-pivot Quicksort (both average case on random permutations). However, it needs $0.6 n\ln n + O(n)$ swaps for that, where classic Quicksort only needs $\frac13 n \ln n + O(n)$ swaps; similarly, my instruction counts also indicate that classic Quicksort is more efficient. What exactly makes Java 7's dual-pivot algorithm faster remains an open question. (In case you are interested in more details on that, I could elaborate further.)
Tweaks of the algorithm like switching to Insertionsort for subproblems smaller than some (constant) threshold or optimizing code outside the partitioning loop (while(k <= g) { ... }
) will only change the numbers hidden by $O(n)$. This means that for large lists, they are not important. For practical input sizes, they still help a lot, so having them in library code is a must.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/24092
0 comments:
Post a Comment
Let us know your responses and feedback