World's most popular travel blog for travel bloggers.

# [Answers] Simplified Maximum Diversity Problem

, ,
Problem Detail:

The Maximum Diversity Problem calls for choosing $m$ items from a list of $n$ items, such that the diversity defined as some metric distance between items is maximized.

I have a simpler problem, which I was hoping I could solve in a simpler manner. In my case I have a list of $n$ items each with a certain non-unique key. I want to chose $m$ items from my list so that the maximal number of items per key is minimized.

e.g., if my list is:

('a', 5), ('b', 4), ('c', 2), ('a', 6), ('b', 5) 

and we must choose $m=3$ items, an optimal solution would be a list containing one item for each key.

Is there an algorithm for doing this that is simpler than those for the Maximum Diversity Problem?

Thinking about it a a bit more, I think I have a "one pass" algorithm for solving this.

Let there be $n$ items with $k$ keys, with $n_i$ items for each key. We want to choose $m_i\le n_i$, such that $\sum m_i = m$, and that $\max{m_i}$ is minimized.

Clearly, if for all $n_i$ we have $n_i > m/k$, we simply choose $m_i = m/k$. Otherwise, we use the following algorithm:

1. Count the number of items for each key ($n_i$) and sort them in ascending order.
2. While $n_i \le m/k$ set $m_i=n_i$
3. For all $n_i > m/k$, choose $m_i = \left\lceil \frac{m - \sum_{j<i} m_j}{k-i+1}\right\rceil$

This algorithm should be $O(n)$ in the length of the list, assuming the number of different keys is negligible in comparison.