Suppose I have a large list of numbers that I want to divide into equal-sized buckets so that every bucket contains only larger numbers than buckets to its left. Numbers within each bucket don't need to be sorted.
For example, for the input
[64, 72, 57, 47, 5, 4, 64, 21, 64, 65, 36, 43, 81, 44, 19, 87, 17, 86, 73, 21, 19, 64, 94, 91, 34, 49, 8, 52, 18, 37] that I want to divide into 5 buckets, some valid outputs would be
[[4, 5, 8, 17, 18], [19, 19, 21, 21, 34], [36, 37, 43, 44, 47], [49, 52, 57, 64, 64], [64, 64, 65, 72, 73]] [[8, 17, 18, 5, 4], [21, 19, 21, 34, 19], [43, 37, 44, 36, 47], [52, 57, 49, 64, 64], [64, 64, 72, 73, 65]] [[8, 18, 5, 4, 17], [19, 19, 21, 21, 34], [47, 36, 44, 43, 37], [52, 64, 49, 64, 57], [64, 72, 64, 65, 73]] [[18, 8, 17, 5, 4], [21, 19, 21, 19, 34], [36, 44, 43, 37, 47], [57, 64, 64, 49, 52], [64, 72, 64, 65, 73]] One approach to do this would be to sort the list, and then divide it at equal intervals. Can this be done faster?
I would also be happy with a result where the buckets are of only roughly equal size, but note that I can't assume a uniform distribution for the numbers, so bucket sort doesn't help.
Asked By : mrueg
Answered By : Raphael
Assuming you want to create $k$ buckets, to the following.
- Obtain elements of rank $\lceil n/k \rceil, 2 \cdot \lceil n/k \rceil \dots, (k-1) \cdot \lceil n/k \rceil$.
- Perform multi-way partitioning (cf. Quicksort) with these elements as pivot.
Both steps take time $\Theta(kn)$. All buckets are within $\pm 1$ of the same size.
In the case of duplicate elements, you can "uniquify" them in a $\Theta(n)$ preprocessing run; otherwise bucket sizes might differ more.
Essentially, this is performing only the top-level call of a $k$-way Quicksort with perfect pivots. All variants regarding pivot selection and partitioning apply and will lead to different runtime and bucket size characteristics.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/28951
0 comments:
Post a Comment
Let us know your responses and feedback