World's most popular travel blog for travel bloggers.

# How to find greatest set intersection of at least a given cardinality?

, ,
Problem Detail:

While dealing with a problem, I uncovered this subproblem:

Input: A set of sets $S = \{S_1,...,S_r\}$ where $\mid$ $S_1$ $\cup$ ... $\cup$ $S_r$$\mid = n, as well as a number k<n. Output: A subset of S, T = \{S_{t_{1}},...,S_{t_{p}}\} so that \mid S_{t_{1}} \cap ... \cap S_{t_{p}}$$\mid$ $\geq k$ with $p=\mid T \mid$ maximized.

I.e. find the greatest number of sets in $S$ whose intersection preserves at least $k$ elements.

Is there any way of solving this problem in reasonable time, say polynomial in $r$ and $n$?

I tried to create an algorithm to solve the problem in reasonable time for large sets and $n$ and got nowhere. There's the obvious brute force algorithm that runs in $n2^r$ time. I also tried to construct a DP algorithm that used smaller subsets for optimal substructure, i.e. making use of $(A \cap B) \cap C = A \cap B \cap C$ and that if $\mid A \cap B \mid < k$, then it's not worth searching any further with $A\cap B$. However, I didn't get anywhere since the number of different subsets grows too quickly.

On the one hand, the problem feels kind of "set coverish" and thus I suspect it's hard. On the other, I'd imagine a problem this simple to be mentioned in some list of hard problems if it were intractable.

###### Answered By : Yuval Filmus

Here is how to reduce clique to your problem. Given a graph $G = (V,E)$ and a number $\ell$, for each vertex $x$ let $$S_x = \{\{y,z\} \in \binom{V}{2} : y,z \neq x\| \cup \{ \{x,y\} : \{x,y\} \in E \}.$$ In words, $S_x$ contains all edges in the graph touching $x$, and all edges not touching $x$, whether or not they're in the graph.

One can then show that for every set of vertices $A$, $$\bigcap_{x \in A} S_x = \{\{y,z\} \in \binom{V}{2} : y,z \notin A\} \cup \{ \{x,y\} : x \in A \text{ and } \{x,y\} \in E \}.$$ In words, the intersection of $S_x$ for $x \in A$ consists of all edges in the graph touching $A$, as well as all edges not touching $A$ (whether in the graph or not). The size of this intersection reaches its maximum $\binom{|V|-|A|}{2} + \binom{|A|}{2}$ if and only if $A$ is a clique in the graph.

Choosing $k = \binom{|V|-\ell}{2} + \binom{\ell}{2}$, the graph contains an $\ell$-clique if and only if there are $p=\ell$ sets whose intersection contains at least $k$ points. The value of $n$ for this instance is $n = \binom{|V|}{2}$.