World's most popular travel blog for travel bloggers.

# [Solved]: Maximum minimal set coverage

, ,
Problem Detail:

Suppose we are given a universal set $U$ and a family of subsets of $U$, denoted by $F$ (elements in $F$ are subsets of $U$). We assume that all elements in $F$ can cover $U$, i.e., $U\subseteq \bigcup_{e\in F}e$. Consider the following max-cost set covering problem:

$\text{argmax}_{E\subseteq F} |E|$

s.t. $U\subseteq \bigcup_{e\in E}e$ and $U\not\subseteq \bigcup_{e\in E'}e,\forall E'\varsubsetneqq E$

I failed to find any literature addressing this. I was just wondering if I could have some references with regard to this problem. Thank you!

## Hardness

Your problem might be called "maximum minimal set cover". In particular, your problem feels like it might be related to minimum maximal matching, which is NP-hard. If you wanted to try to prove NP-hardness, you might try to see if you can find a reduction from minimum maximal matching.

## Algorithms

Your problem can be expressed as an instance of SAT, and then you could apply an off-the-shelf SAT solver. In particular, we formulate the decision problem variant, where we are given $F$ and an integer $k$, and we ask whether there exists a maximal cover $E$ such that $|E|=k$. To express this decision problem in SAT, we introduce constraints to capture two different aspects of the problem:

• The requirement that $E$ be a cover of size $k$: We introduce boolean variables $x_f$ (one for each $f \in F$), with the intent that $E=\{f \in F : x_f = \text{true}\}$. It is easy to write down boolean constraints on the $x_f$'s to express the requirement that $E$ be a cover (i.e., for each $u \in U$, you have a clause $\lor_{u \in f} \; x_f$) and that $|E|=k$ (see this answer for several ways to express the latter in SAT).

• The requirement that no subset of $E$ be a cover: We introduce the boolean variables $y_g$ (one for each $g \in F$), with the intent that $y_g$ is true if and only $E\setminus\{g\}$ is a cover. It is straightforward to write down boolean constraints to ensure that each $y_g$ has the appropriate value (i.e., $y_g = \land_{u \in U} \lor_{u \in f, f \ne g} \; x_f$; now apply the Tseitin transform). Then, we can add the clause $\neg x_g \lor \neg y_g$ for each $g \in F$.

Now there exists a maximal cover $E$ of size $k$ if and only if there is some satisfying assignment for this SAT instance. So, run an off-the-shelf SAT solver on this, and do binary search over $k$ to find the largest such cover, and you are done.

Alternatively, you can express your problem as an instance of integer linear programming, and then apply an off-the-shelf ILP solver. This might provide reasonable approximations to the optimum in practice, as many ILP solvers have heuristics to try to find an approximate optimum (i.e., a valuation to the variables that makes the objective function as large as they're able to).

One approach to formulate this as an ILP instance is similar to that for SAT:

• The requirement that $E$ be a cover of size $k$: We introduce an integer variable $k$. Also, we introduce integer variables $x_f$ (one for each $f \in F$) that are constrained to be zero or one. Zero plays the role of false, one the role of true. Thus, the intent is that $E=\{f \in F : x_f=1\}$. It is easy to write down linear constraints to express the requirement that $E$ be a cover (i.e., for each $u \in U$, we have the inequality $\sum_{u \in f} x_f \ge 1$) and that $E$ have size $k$ (i.e., $\sum_f x_f = k$).

• The requirement that no subset of $E$ be a cover: We introduce linear constraints to express the requirement that if $x_f=1$, then $E\setminus \{g\}$ is not a cover, for all $g \in F$. This can be done using the methods described here: Express boolean logic operations in zero-one integer linear programming (ILP).

In particular, here's what it looks like. Introduce zero-or-one integer variables $y_{g,u}$ for each $g \in F$ and each $u \in U$, where $y_{g,u}=1$ is intended to represent that $E \setminus \{g\}$ does not cover $u$. In particular, for each $u \in U$ and each $g \in F$, add the inequality

$$\sum_{u \in f,f \ne g} x_f \le K - K y_{g,u},$$

where $K$ is a sufficiently large constant ($K=|U|$ will suffice).

Also, for each $g \in F$, add the inequality

$$\sum_{u \in U} y_{g,u} \ge 1.$$

This ensures that, for each $g \in F$, $E\setminus\{g\}$ is not a cover (since there exists $u \in U$ such that $E\setminus\{g\}$ does not cover $u$).

• Maximize $k$: You want to maximize $k$. So, your integer linear program is: maximize $k$, subject to the linear inequalities mentioned above.

Now you can apply an off-the-shelf ILP solver to this ILP instance.