13.2.1. Prelude: Randomized Algorithms

To discuss randomized rounding, we need to agree on some terminology concerning randomized algorithms.

The behaviour of a deterministic algorithm is completely determined by its input: It always produces the same output for a given input, and its running time is exactly the same every time we run the algorithm on the same input. The behaviour of a randomized algorithm depends on the input and on certain random choices the algorithm makes during its computation. As a result, the algorithm may produce different outputs when run several times on the same input. Whether or not two runs on the same input produce the same output, their running times may also differ substantially, depending on the random choices the algorithm makes. Thus, both the output the algorithm produces and its running time are random variables whose distributions are determined by the probability distributions of the random choices the algorithm makes. This leads us to distinguish two types of randomized algorithms, named after the two Meccas of gambling, Monte Carlo and Las Vegas.

Monte Carlo Algorithms

A Monte Carlo randomized algorithm provides a very weak correctness guarantee. Such an algorithm is often guaranteed to terminate in a certain amount of time depending on the input size, just as a deterministic algorithm, but it may produce an incorrect output. All we can prove for such an algorithm is that the output it produces is correct with at least a certain probability. Some Monte Carlo algorithms may not even guarantee bounds on their running times and may terminate quickly only in expectation.
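
As a concrete illustration (not one we will use later in this section), consider Freivalds' algorithm for verifying a matrix product: given \(n \times n\) matrices \(A\), \(B\), and \(C\), it tests whether \(AB = C\) in \(O(n^2)\) time per trial by comparing \(A(Br)\) with \(Cr\) for a random 0/1 vector \(r\). If \(AB \ne C\), a single trial detects this with probability at least \(\frac{1}{2}\), so \(k\) independent trials are fooled with probability at most \(2^{-k}\). The following Python sketch is ours; the names freivalds, matvec, and trials are not from the text:

```python
import random

def freivalds(A, B, C, trials=20):
    """Monte Carlo test whether A * B == C for n x n matrices A, B, C.

    Each trial picks a random 0/1 vector r and compares A(Br) with Cr,
    which takes only O(n^2) arithmetic operations.  If A * B != C, a
    single trial reports the mismatch with probability at least 1/2, so
    all `trials` trials are fooled with probability at most 2**-trials.
    """
    n = len(A)

    def matvec(M, v):
        # Multiply the n x n matrix M (given as a list of rows) with v.
        return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        if matvec(A, matvec(B, r)) != matvec(C, r):
            return False  # definitely A * B != C
    return True  # correct with probability at least 1 - 2**-trials
```

Note the one-sided error: an answer of "not equal" is always correct, while an answer of "equal" may be wrong with small probability.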

Las Vegas Algorithms

While we may be willing to accept modest variations in the running time of an algorithm, we are generally less happy if we cannot guarantee that the algorithm produces a correct output (unless the notion of correctness is rather artificial, for example, if the input is noisy to begin with). Thus, we may prefer a Las Vegas randomized algorithm. Such an algorithm is guaranteed to produce a correct output, but its running time may vary depending on the random choices the algorithm makes. For such an algorithm, we are interested in its expected running time or, even better, in a bound on the running time that the algorithm exceeds only with vanishing probability.
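
As a toy illustration (again ours, not the text's), suppose at least half of the entries of an array satisfy some predicate and we want the index of one such entry. Probing random positions until we hit one is a Las Vegas algorithm: the index it returns is always correct, and since each probe succeeds with probability at least \(\frac{1}{2}\), the expected number of probes is at most \(2\). The names find_marked_index and is_marked below are placeholders:

```python
import random

def find_marked_index(items, is_marked):
    """Las Vegas search: return an index i such that is_marked(items[i]).

    Assumes at least half of the entries of `items` are marked.  Each
    random probe then succeeds with probability at least 1/2, so the
    expected number of probes is at most 2.  The output is always
    correct; only the running time is random.
    """
    n = len(items)
    while True:
        i = random.randrange(n)
        if is_marked(items[i]):
            return i
```

Randomized quicksort is a more substantial example of a Las Vegas algorithm: it always produces a sorted sequence, but its running time depends on the random pivot choices.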

Throughout this section, we will focus on Monte Carlo algorithms for deriving solutions of ILPs from fractional solutions of their LP relaxations. By the following lemma, this is good enough because we can convert any Monte Carlo algorithm into a Las Vegas algorithm as long as we can efficiently verify whether the solution produced by the Monte Carlo algorithm is correct.

Lemma 13.6: Let \(\mathcal{M}\) be a Monte Carlo algorithm for some problem \(\Pi\) with expected running time \(T_\mathcal{M}(n)\) for any input of size \(n\). Assume that \(\mathcal{M}\) produces a correct output with probability at least \(p\). Also assume that we have an algorithm \(\mathcal{C}\) that we can use to check in expected time \(T_\mathcal{C}(n)\) whether the output \(O\) produced by \(\mathcal{M}\) for a given input \(I\) is correct. Then there exists a Las Vegas algorithm \(\mathcal{L}\) for \(\Pi\) with expected running time \(T_\mathcal{L}(n) \le \frac{T_\mathcal{M}(n) + T_\mathcal{C}(n)}{p}\).

Proof: The algorithm \(\mathcal{L}\) is simple: Given an input \(I\), it runs \(\mathcal{M}\) on \(I\) to produce some output \(O\). Then it uses \(\mathcal{C}\) to check whether \(O\) is a correct output for \(I\). If the answer is yes, then \(\mathcal{L}\) returns \(O\) as a correct output for the input \(I\). If the answer is no, then \(\mathcal{L}\) starts over, running \(\mathcal{M}\) on \(I\) again to obtain a new output \(O'\). \(\mathcal{L}\) runs \(\mathcal{C}\) on this new output \(O'\) to check whether this output is correct. This process continues until \(\mathcal{C}\) confirms that the output produced by \(\mathcal{M}\) in the current iteration is a correct solution for the input \(I\). At that point, \(\mathcal{L}\) returns this output.

If \(\mathcal{L}\) runs for \(t\) iterations, then its expected running time is \(t \cdot (T_\mathcal{M}(n) + T_\mathcal{C}(n))\) because each iteration runs \(\mathcal{M}\) and \(\mathcal{C}\) once. Thus, the expected running time of \(\mathcal{L}\) is

\[T_\mathcal{L}(n) = \sum_{t=1}^\infty t \cdot (T_\mathcal{M}(n) + T_\mathcal{C}(n)) \cdot P[X = t],\]

where \(X\) is the number of iterations that \(\mathcal{L}\) executes, a random variable that depends on the random choices made by \(\mathcal{M}\) each time we run \(\mathcal{M}\). This can be rearranged as

\[\begin{aligned} T_\mathcal{L}(n) &= (T_\mathcal{M}(n) + T_\mathcal{C}(n)) \cdot \sum_{t=1}^\infty t \cdot P[X = t]\\ &= (T_\mathcal{M}(n) + T_\mathcal{C}(n)) \cdot \sum_{t=1}^\infty P[X \ge t]. \end{aligned}\]

The second equality uses the standard identity \(\sum_{t=1}^\infty t \cdot P[X = t] = \sum_{t=1}^\infty P[X \ge t]\), valid for any random variable \(X\) with values in the non-negative integers; it follows by writing \(t \cdot P[X = t] = \sum_{s=1}^t P[X = t]\) and swapping the order of summation.

For the algorithm to run for at least \(t\) iterations, the first \(t - 1\) iterations all need to produce incorrect outputs. Since \(\mathcal{M}\) produces a correct output with probability at least \(p\) and the runs of \(\mathcal{M}\) in the different iterations of \(\mathcal{L}\) are independent of each other, the probability that the first \(t - 1\) iterations all produce incorrect outputs is at most \((1 - p)^{t-1}\). Thus,

\[\begin{aligned} T_\mathcal{L}(n) &\le (T_\mathcal{M}(n) + T_\mathcal{C}(n)) \cdot \sum_{t=1}^\infty (1 - p)^{t-1}\\ &= (T_\mathcal{M}(n) + T_\mathcal{C}(n)) \cdot \sum_{t=0}^\infty (1 - p)^t\\ &= \frac{T_\mathcal{M}(n) + T_\mathcal{C}(n)}{p}.\ \text{▨} \end{aligned}\]
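
The construction of \(\mathcal{L}\) in this proof is nothing more than a repeat-until-correct loop around \(\mathcal{M}\) and \(\mathcal{C}\). A minimal Python sketch, where monte_carlo, check, and instance are placeholders for \(\mathcal{M}\), \(\mathcal{C}\), and the input \(I\):

```python
def las_vegas_from_monte_carlo(monte_carlo, check, instance):
    """Repeat-until-correct wrapper from the proof of Lemma 13.6.

    monte_carlo(instance) returns an output that is correct with
    probability at least p, and check(instance, output) tests whether
    it is correct.  Each iteration runs both once, the iterations are
    independent, and the loop stops as soon as an output passes the
    check, so the expected number of iterations is at most 1/p.
    """
    while True:
        output = monte_carlo(instance)
        if check(instance, output):
            return output
```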

