World's most popular travel blog for travel bloggers.

[Solved]: Fixed size set to contain the maximum number of given sets

, , No Comments
Problem Detail: 

I asked this question in SO here

I have about 1000 sets of size <=5 containing numbers 1 to 100.

{1}, {4}, {1,3}, {3,5,6}, {4,5,6,7}, {5,25,42,67,100} ...  

Is it possible to find a set of size 20 that contains the maximum number of given sets?

In other words, let $U = \{1, 2, ..., 100\}$ and $S \subseteq \{Z \in 2^U \mid |Z| \leq 5\}$. How can I find $X \subseteq U$ with $|X| = 20$ such that $|\{Y \in S \mid Y \subseteq X\}|$ is maximized?

Checking each of 100!/(80!*20!) sets or bulding all set combinations with size <=20 is inefficient.

Is there any (semi)efficient solution or is it np-complete?

The set {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20} contains 5 sets.

Asked By : albert

Answered By : InformedA

The decision version of the problem can be formulated as follows:

  • Given a collection of sets of numbers $S$. Each number is drawn from a set $U$. You want to know if there is a way to pick $k$ numbers from $U$ to form a set $X$ which is the superset of $s$ sets in $S$.

The following is a reduction from k-clique which is an NP-complete problem.

The k-clique problem has an input as a graph and asks if there is a way to pick $k$ nodes so that there is an edge between any two of the $k$ nodes.

We turn each node into a number. These numbers form the set $U$. We turn each edge into a set containing two numbers representing the corresponding two nodes of the edge. These sets form the set $S$.

The question to the original problem is: Is there a way to pick $k$ numbers from $U$ to form a set $X$ such that $X$ is the superset of $C_{2}^k$ sets in $S$

The proof has two directions.

  1. If there is a k-clique in the graph, then we can pick the $k$ numbers corresponding to $k$ nodes in the clique to form $X$. Because of the way we transform the graph, we know there will be $C_{2}^k$ sets in $S$ each corresponds to an edge in the original graph and thus is a subset of $X$. So the answer to the original problem is positive.
  2. If there is no clique of size $k$, then for any set of $k$ nodes chosen from the graph one can only have less than $C_{2}^k$ undirected edges between any two picked nodes. Otherwise, we will have a clique of size $k$. This means that there can only be less than $C_{2}^k$ subsets of $S$ which is a subset of the set $X$ formed by any $k$ chosen numbers. So the answer to the original problem is not positive.

The transformation is in poly-time. The reduction is proven, so the problem is NP-hard. The problem is clearly in NP because we check how many sets covered by $k$ numbers in poly time. This means the original problem is NP-complete.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/32155

 Ask a Question

 Download Related Notes/Documents

0 comments:

Post a Comment

Let us know your responses and feedback