21.6. Polynomial-Time Reductions

Now hat we know that an NP-hard problem is unlikely to have a polynomial-time algorithm that solves it, how do we prove that a problem is NP-hard?

For the first problem ever to be proven to be NP-hard, the satisfiability problem, the argument goes roughly like this: Given any non-deterministic Turing Machine \(M\) that decides some language \(\mathcal{L}\), we can, in polynomial time, convert any string \(\sigma \in \Sigma^*\) into a Boolean formula \(F_\sigma\) whose length is polynomial in the length of \(\sigma\). The exact polynomial depends on how long \(M\) takes to decide whether \(\sigma \in \mathcal{L}\). \(F_\sigma\) describes all possible computations of \(M\) on the input \(\sigma\) (remember that the non-deterministic choices that \(M\) can make \(M\) execute different steps and produce a different result when run on the same input \(\sigma\)). This formula has the property that it is satisfiable if and only if there exists a set of non-deterministic choices that make \(M\) accept \(\sigma\), that is, if and only if \(\sigma \in \mathcal{L}\). Thus, by deciding whether \(F_\sigma\) is satisfiable, we can decide whether \(\sigma \in \mathcal{L}\). Since we can construct \(F_\sigma\) from \(\sigma\) in polynomial time, we can decide whether \(\sigma \in \mathcal{L}\) in polynomial time if we have an algorithm \(\mathcal{A}\) that decides in polynomial time whether \(F_\sigma\) is satisfiable: We convert \(\sigma\) into \(F_\sigma\) in polynomial time and then run \(\mathcal{A}\) on \(F_\sigma\) in polynomial time. Since this works for any non-deterministic Turing Machine \(M\), this shows that the existence of a polynomial-time algorithm \(\mathcal{A}\) to decide whether any Boolean formula is satisfiable implies that any language that can be decided in polynomial-time by a non-deterministic Turing Machine can also be decided in polynomial-time by a deterministic algorithm (by a deterministic Turing Machine): \(\textrm{P} = \textrm{NP}\). Therefore, the satisfiability problem is NP-hard.

Once we have our first NP-hard problem in hand, proving that other problems are also NP-hard becomes much easier. The tool to construct such proofs is the polynomial-time reduction:

A reduction from some language \(\mathcal{L}_1\) to another language \(\mathcal{L}_2\) is a deterministic algorithm \(\mathcal{R}\) that converts any input string \(\sigma \in \Sigma^*\) into a string \(\mathcal{R}(\sigma) \in \Sigma^*\) such that \(\sigma \in \mathcal{L}_1\) if and only if \(\mathcal{R}(\sigma) \in \mathcal{L}_2\). \(\mathcal{R}\) is a polynomial-time reduction if it runs in \(O(|\sigma|^c)\) time, for some constant \(c\).

The usefulness of polynomial-time reductions to prove NP-hardness stems from the following lemma and its corollary.

Lemma 21.3: If \(\mathcal{L}_2 \in \textrm{P}\) and there exists a polynomial-time reduction from \(\mathcal{L}_1\) to \(\mathcal{L}_2\), then \(\mathcal{L}_1 \in \textrm{P}\).

Proof: Assume that \(\mathcal{R}\) is a polynomial-time reduction from \(\mathcal{L}_1\) to \(\mathcal{L}_2\). Let \(c\) be a constant such that \(\mathcal{R}\) runs in at most \(|\sigma|^c\) time on any input \(\sigma\). Since just writing down \(\mathcal{R}(\sigma)\) takes \(|\mathcal{R}(\sigma)|\) time, we have \(|\mathcal{R}(\sigma)| \le |\sigma|^c\).

Next assume that \(\mathcal{L}_2 \in \textrm{P}\). Then there exists an algorithm \(\mathcal{D}\) that takes \(|\sigma|^d\) time to decide whether any given string \(\sigma \in \Sigma^*\) belongs to \(\mathcal{L}_2\).

We can use \(\mathcal{R}\) and \(\mathcal{D}\) to decide whether a string \(\sigma \in \Sigma^*\) belongs to \(\mathcal{L}_1\): We use \(\mathcal{R}\) to compute \(\mathcal{R}(\sigma)\) from \(\sigma\) and then apply \(\mathcal{D}\) to \(\mathcal{R}(\sigma)\) to decide whether \(\mathcal{R}(\sigma) \in \mathcal{L}_2\). Since \(\mathcal{R}\) is a reduction from \(\mathcal{L}_1\) to \(\mathcal{L}_2\), we have \(\sigma \in \mathcal{L}_1\) if and only if \(\mathcal{R}(\sigma) \in \mathcal{L}_2\), that is, the answer given by \(\mathcal{D}\) for the input \(\mathcal{R}(\sigma)\) reflects whether \(\sigma \in \mathcal{L}_1\). We thus have a correct decision algorithm for \(\mathcal{L}_1\).

The running time of this algorithm is polynomial in \(|\sigma|\): Running \(\mathcal{R}\) on \(\sigma\) takes at most \(|\sigma|^c\) time. Running \(\mathcal{D}\) on \(\mathcal{R}(\sigma)\) takes at most \(|\mathcal{R}(\sigma)|^d\) time. Since \(|\mathcal{R}(\sigma)| \le |\sigma|^c\), running \(\mathcal{D}\) on \(\mathcal{R}(\sigma)\) thus takes at most \(|\sigma|^{cd}\) time. The total running time of the algorithm is therefore \(|\sigma|^c + |\sigma|^{cd}\), which is polynomial in \(|\sigma|\). ▨

Corollary 21.4: If \(\mathcal{L}_1\) is NP-hard and there exists a polynomial-time reduction from \(\mathcal{L}_1\) to \(\mathcal{L}_2\), then \(\mathcal{L}_2\) is also NP-hard.

Proof: By Lemma 21.3, the existence of a polynomial-time reduction from \(\mathcal{L}_1\) to \(\mathcal{L}_2\) means that \(\mathcal{L}_1 \in \textrm{P}\) if \(\mathcal{L}_2 \in \textrm{P}\). Since \(\mathcal{L}_1\) is NP-hard, \(\mathcal{L}_1 \in \textrm{P}\) implies that \(\textrm{P} = \textrm{NP}\). By combining these two implications, we obtain that \(\mathcal{L}_2 \in \textrm{P}\) implies that \(\textrm{P} = \textrm{NP}\), that is, \(\mathcal{L}_2\) is NP-hard. ▨

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Algorithms II

21.6. Polynomial-Time Reductions