21.5. NP-Hardness and NP-Completeness

Given the definitions of P and NP, it is obvious that \(\textrm{P} \subseteq \textrm{NP}\): Every language that can be decided in polynomial time can also be verified in polynomial time. Indeed, given a deterministic Turing Machine \(M\) that decides \(\mathcal{L}\) in polynomial time, we can use it to verify \(\mathcal{L}\) in polynomial time. Given a pair of strings \((\sigma, \tau)\), we simply ignore \(\tau\) and run \(M\) on \(\sigma\). If \(\sigma \in \mathcal{L}\), this procedure accepts \(\sigma\) no matter the choice of \(\tau\). If \(\sigma \notin \mathcal{L}\), it rejects \(\sigma\) no matter the choice of \(\tau\). This meets the requirements of polynomial-time verification of \(\mathcal{L}\). In fact, it is stronger in the case when \(\sigma \in \mathcal{L}\).

One of the most tantalizing questions in Computer Science is whether this inclusion is strict, whether \(\textrm{P} \ne \textrm{NP}\). While nobody has a conclusive answer to this question, the evidence is overwhelming that the answer should be yes. I'll comment on this in a little more detail below. For now, let us assume that \(\textrm{P} \ne \textrm{NP}\). Then the following definition captures the idea that a decision problem cannot be solved in polynomial time.

A formal language \(\mathcal{L}\) (decision problem \(\Pi\)) is NP-hard if \(\mathcal{L} \in \textrm{P}\) implies that \(\textrm{P} = \textrm{NP}\). \(\mathcal{L}\) is NP-complete if \(\mathcal{L}\) is NP-hard and \(\mathcal{L} \in \textrm{NP}\).

Stated differently, this says that if we can find a polynomial-time algorithm to solve an NP-hard decision problem \(\Pi\), then \(\textrm{P} = \textrm{NP}\). Thus, if we believe that \(\textrm{P} \ne \textrm{NP}\), then no such polynomial-time algorithm to solve \(\Pi\) can exist.

To prove that a problem is unlikely to have a polynomial-time solution—unless \(\textrm{P} = \textrm{NP}\)—it suffices to prove that the problem is NP-hard. An NP-hard problem does not even have to be known to be in NP. For a problem to be NP-complete, it also has to be in NP. This captures the idea that NP-complete problems are the hardest problems in NP: We know how to verify a solution in polynomial time but we cannot find such a solution in polynomial time unless \(\textrm{P} = \textrm{NP}\).

So what's this evidence that \(\textrm{P} = \textrm{NP}\) is highly unlikely? Since every NP-complete problem is itself in NP, the conclusion that \(\textrm{P} = \textrm{NP}\) would imply that we can solve every NP-complete problem in polynomial time. In other words, by coming up with a polynomial-time solution to just one NP-hard problem, we obtain for free polynomial-time solutions to all NP-complete problems. There are by now thousands of problems that are known to be NP-complete and which have eluded all attempts by the brightest minds in Computer Science to design polynomial-time algorithms for these problems over the last six decades. Do we really believe that we can obtain polynomial-time algorithms for all of them in one fell swoop, by solving just one of them in polynomial time?

This is not a proof that \(\textrm{P} \ne \textrm{NP}\). You can choose to believe that \(\textrm{P} = \textrm{NP}\) or that \(\textrm{P} \ne \textrm{NP}\). Whatever you choose to believe, however, if a problem is NP-hard, then the stakes are very high when searching for a polynomial-time algorithm for this problem: If you succeed, you prove that \(\textrm{P} = \textrm{NP}\) and thereby settle one of the most iconic questions in Computer Science.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Algorithms II

21.5. NP-Hardness and NP-Completeness