13.2.2. Set Cover
To apply randomized rounding to the set cover problem, we start with the LP relaxation of the set cover problem:
\[\begin{gathered} \text{Minimize } \sum_{S \in \mathcal{S}} w_Sx_S\\ \begin{aligned} \llap{s.t. } \sum_{e \in S} x_S &\ge 1 && \forall e \in U\\ x_S &\ge 0 && \forall S \in \mathcal{S} \end{aligned} \end{gathered}\tag{13.6}\]
As observed in Chapter 11, the upper bounds \(x_S \le 1\) are redundant and will never be violated by an optimal solution. Thus, an optimal solution \(\hat x\) of this LP satisfies \(0 \le \hat x_S \le 1\) for all \(S \in \mathcal{S}\).
Now consider an integral solution \(\tilde x\) obtained by setting \(\tilde x_S = 1\) with probability \(\hat x_S\) and \(\tilde x_S = 0\) with probability \(1 - \hat x_S\). This is the randomized part of the rounding procedure: Different runs of the rounding procedure may produce different solutions \(\tilde x\) from the same fractional solution \(\hat x\).
The expected cost of \(\tilde x\) is
\[E \left[ \sum_{S \in \mathcal{S}} w_S \tilde x_S \right] = \sum_{S \in \mathcal{S}} w_S E[\tilde x_S] = \sum_{S \in \mathcal{S}} w_S \hat x_S = \mathrm{OPT}_f,\]
but \(\tilde x\) is unlikely to be a feasible solution because there is a good chance that setting some of the variables \(\tilde x_S = 0\) leaves some elements in \(U\) uncovered. To boost the probability that we cover all elements in \(U\), we apply this randomized rounding process \(t = \lceil \ln n + 2 \rceil\) times, obtaining \(t\) random solutions \(\tilde x^{(1)}, \ldots, \tilde x^{(t)}\). We construct a solution \(\breve x\) by setting \(\breve x_S = 1\) if and only if there exists an index \(1 \le i \le t\) such that \(\tilde x_S^{(i)} = 1\). In other words, if we define
\[\tilde{\mathcal{C}}^{(i)} = \left\{ S \in \mathcal{S} \mid \tilde x_S^{(i)} = 1 \right\} \quad \forall 1 \le i \le t,\]
then the set \(\breve{\mathcal{C}}\) corresponding to \(\breve x\) is
\[\breve{\mathcal{C}} = \bigcup_{i=1}^t \tilde{\mathcal{C}}^{(i)}.\]
We have
\[E \bigl[ w \bigl( \breve{\mathcal{C}} \bigr) \bigr] \le \sum_{i=1}^t E \Bigl[ w \Bigl( \tilde{\mathcal{C}}^{(i)} \Bigr) \Bigr] = t \cdot \mathrm{OPT}_f. \tag{13.7}\]
This is the solution we return.
The algorithm to construct \(\breve{\mathcal{C}}\) just described takes polynomial time because it takes polynomial time to solve (13.6) to obtain \(\hat x\), constructing the solutions \(\tilde x^{(1)}, \ldots, \tilde x^{(t)}\) by rounding \(\hat x\) takes polynomial time, and \(\breve{\mathcal{C}}\) is then easily constructed from these solutions in polynomial time. By the following lemma, \(\breve{\mathcal{C}}\) is an \(O(\lg n)\)-approximation of an optimal set cover with probability at least \(\frac{1}{2}\).
Lemma 13.7: With probability at least \(\frac{1}{2}\), \(\breve{\mathcal{C}}\) is a set cover of weight at most
\[4 \cdot \lceil \ln n + 2\rceil \cdot \mathrm{OPT}_f \le 4 \cdot \lceil \ln n + 2\rceil \cdot \mathrm{OPT}.\]
Proof: \(\breve{\mathcal{C}}\) may fail to be a set cover of weight at most \(4 \cdot \lceil\ln n + 2 \rceil \cdot \mathrm{OPT}_f = 4t \cdot \mathrm{OPT}_f\) for two reasons: it may fail to be a set cover or its weight may exceed \(4t \cdot \mathrm{OPT}_f\). We prove that each of these events happens with probability at most \(\frac{1}{4}\). Thus, \(\breve{\mathcal{C}}\) fails to be a set cover of weight at most \(4t \cdot \mathrm{OPT}_f\) with probability at most \(\frac{1}{2}\), that is, it is a set cover of weight at most \(4t \cdot \mathrm{OPT}_f\) with probability at least \(\frac{1}{2}\).
First let us bound the probability that \(w\bigl(\breve{\mathcal{C}}\bigr) > 4t \cdot \mathrm{OPT}_f\). By (13.7), we have
\[E\bigl[w\bigl(\breve{\mathcal{C}}\bigr)\bigr] \le t \cdot \mathrm{OPT}_f.\]
By Markov's inequality, any random variable \(X\) exceeds its expectation by a factor of at least \(f\) with probability at most \(\frac{1}{f}\):
\[P[X \ge f \cdot E[X]] \le \frac{1}{f}.\]
Thus,
\[P\bigl[w\bigl(\breve{\mathcal{C}}\bigr) > 4t \cdot \mathrm{OPT}_f\bigr] \le P\bigl[w\bigl(\breve{\mathcal{C}}\bigr) > 4 \cdot E\bigl[w\bigl(\breve{\mathcal{C}}\bigr)\bigr]\bigr] \le \frac{1}{4}.\]
Next let us bound the probability that \(\breve{\mathcal{C}}\) is not a set cover. Consider any element \(u \in U\) and assume it is contained in \(k\) sets \(S_1, \ldots, S_k \in \mathcal{S}\). Any fixed set collection \(\tilde{\mathcal{C}}^{(i)}\) does not cover \(u\) only if \(\tilde x_{S_j}^{(i)} = 0\) for all \(1 \le j \le k\). This happens with probability \(\prod_{j=1}^k\bigl(1 - \hat x_{S_j}\bigr)\) because \(P\bigl[\tilde x_{S_j}^{(i)} = 0\bigr] = 1 - \hat x_{S_j}\) for all \(1 \le j \le k\). Since \(\hat x\) is a feasible solution of (13.6), we have \(\sum_{j=1}^k \hat x_{S_j} \ge 1\). Subject to this constraint, the product \(\prod_{j=1}^k\bigl(1 - \hat x_{S_j}\bigr)\) is maximized if \(\hat x_{S_j} = \frac{1}{k}\) for all \(1 \le j \le k\). This can be shown using elementary calculus. Thus, \(\tilde{\mathcal{C}}^{(i)}\) leaves \(u\) uncovered with probability at most
\[\left(1 - \frac{1}{k}\right)^k \le \frac{1}{e}.\]
The probability that \(\breve{\mathcal{C}}\) does not cover \(u\) is the probability that none of the sets \(\tilde{\mathcal{C}}^{(1)}, \ldots, \tilde{\mathcal{C}}^{(t)}\) covers \(u\). Since the probability that any set \(\tilde{\mathcal{C}}^{(i)}\) leaves \(u\) uncovered is at most \(\frac{1}{e}\) and the sets \(\tilde{\mathcal{C}}^{(1)}, \ldots, \tilde{\mathcal{C}}^{(t)}\) are chosen independently, the probability that none of the sets \(\tilde{\mathcal{C}}^{(1)}, \ldots, \tilde{\mathcal{C}}^{(t)}\) covers \(u\) is at most
\[\left(\frac{1}{e}\right)^t \le \left(\frac{1}{e}\right)^{\ln n + 2} < \frac{1}{4n}.\]
The probability that \(\breve{\mathcal{C}}\) is not a set cover of \(U\) is the probability that there exists an element \(u \in U\) that is not covered by \(\breve{\mathcal{C}}\). Since each element is covered by \(\breve{\mathcal{C}}\) with probability at least \(1 - \frac{1}{4n}\), the probability that there exists an element not covered by \(\breve{\mathcal{C}}\) is thus at most
\[n \cdot \frac{1}{4n} = \frac{1}{4}.\ \text{▨}\]
By Lemma 13.7, we can compute in polynomial time a collection of sets \(\breve{\mathcal{C}} \subseteq \mathcal{S}\) that is a set cover of weight at most \(4t \cdot \mathrm{OPT}_f\) with probability at least \(\frac{1}{2}\). We can also easily test in polynomial time whether \(\breve{\mathcal{C}}\) is a set cover and whether its weight is at most \(4t \cdot \mathrm{OPT}_f\) because we know \(\hat x\) and thus \(\mathrm{OPT}_f\). If \(\breve{\mathcal{C}}\) is not a set cover or its weight exceeds \(4t \cdot \mathrm{OPT}_f\), we simply construct a new solution \(\breve{\mathcal{C}}\) by repeating the rounding process. We repeat this process until \(\breve{\mathcal{C}}\) is a set cover of weight at most \(4t \cdot \mathrm{OPT}_f\). Since \(\breve{\mathcal{C}}\) is a set cover of weight at most \(4t \cdot \mathrm{OPT}_f\) with probability at least \(\frac{1}{2}\), Lemma 13.6 shows that the expected number of repetitions of the rounding process is \(2\). This gives the following corollary.
Corollary 13.8: An \(O(\lg n)\)-approximation of an optimal set cover can be computed in expected polynomial time.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.