7.4.4. The Hopcroft-Karp Algorithm*
The algorithm for the bipartite maximum-cardinality matching problem discussed in this section improves on the simple algorithm from Section 7.4.3 and achieves a running time of \(O\bigl(n + m\sqrt{n}\bigr)\). This algorithm is due to Hopcroft and Karp. The strategy it uses to improve on the running time of the simple \(O(nm)\)-time algorithm has many parallels to Dinitz's maximum flow algorithm: Instead of finding one augmenting path at a time, we find many such paths in each iteration. To find these paths, we construct a level graph and then use DFS in this graph to find a set of augmenting paths equivalent to a blocking flow. This ensures that the length of a shortest augmenting path increases from one iteration to the next and is the key to proving that the algortihm terminates after \(O\bigl(\sqrt{n}\bigr)\) iterations.
Let us call an augmenting path \(P\) a shortest augmenting path if there is no shorter augmenting path than \(P\).1
We call a set \(A = \{P_1, \ldots, P_t\}\) of shortest augmenting paths inclusion-maximal if the paths in \(A\) are vertex-disjoint and every shortest augmenting path not in \(A\) shares at least one vertex with some path in \(A\). In this case, \(M \oplus (P_1 \cup \cdots \cup P_t)\) is a matching of size \(|M| + t\).
The Hopcroft-Karp algorithm starts with some matching \(M\) that can be found in \(O(n + m)\) time. The trivial matching \(M = \emptyset\) or a maximal matching fits the bill. Each iteration of the algorithm finds an inclusion-maximal set of shortest augmenting paths \({P_1, \ldots, P_t}\) and replaces \(M\) with \(M \oplus (P_1 \cup \cdots \cup P_t)\). The algorithm ends once there is no augmenting path, that is, once the set of paths found in the current iteration is empty. By Corollary 7.9, \(M\) is a maximum-cardinality matching at this point.
Theorem 7.13: The Hopcroft-Karp algorithm computes a maximum-cardinality matching in a bipartite graph in \(O\bigl(n + m\sqrt{n}\bigr)\) time.
Proof: We just argued that the algorithm computes a maximum-cardinality matching. We prove in Section 7.4.4.1 that an inclusion-maximal set of shortest augmenting paths can be found in \(O(n + m)\) time. If \(G\) has no isolated vertices, then \(n \le 2m\), so finding an inclusion-maximal set of shortest augmenting paths takes \(O(m)\) time. To ensure that \(G\) has no isolated vertices, we remove these vertices from \(G\), which takes \(O(n)\) time. As we show next, the Hopcroft-Karp algorithm runs for \(O\bigl(\sqrt{n}\bigr)\) iterations. Thus, its total running time is \(O\bigl(n + m\sqrt{n}\bigr)\).
By Lemma 7.14 below, after the first \(\sqrt{n}\) iterations, every augmenting path for the current matching \(M\) has length at least \(\sqrt{n}\). Let \(M'\) be a maximum-cardinality matching and let \(S = M \oplus M'\). By Lemma 7.8, \(S\) contains at least \(|M'| - |M|\) vertex-disjoint augmenting paths with respect to \(M\). Since each such path has length at least \(\sqrt{n}\), there are at most \(\sqrt{n}\) such paths. Thus, \(|M'| - |M| \le \sqrt{n}\).
Since each iteration of the Hopcroft-Karp algorithm increases the size of the current matching by at least one, this shows that it takes at most \(\sqrt{n}\) additional iterations to obtain a maximum-cardinality matching. The total number of iterations is thus at most \(2\sqrt{n}\). ▨
Lemma 7.14: Every iteration of the Hopcroft-Karp algorithm increases the length of the shortest augmenting path with respect to the current matching \(M\).
Proof: Let \(M\) be the matching at the beginning of the current iteration, let \(M'\) be the matching at the end of the current iteration, let \(A = \{P_1, \ldots, P_t\}\) be the inclusion-maximal set of shortest augmenting paths we use to construct \(M'\) from \(M\), and let \(k\) be the length of the paths in \(A\).
Consider any augmenting path \(P = \langle v_0, \ldots, v_{k'} \rangle\) with respect to \(M'\). We need to show that \(k' > k\).
If \(P\) is vertex-disjoint from all paths in \(A\), then \(P\) is also an augmenting path for \(M\) and \(A \cup {P}\) is a collection of vertex-disjoint augmenting paths for \(M\). Since \(A\) is an inclusion-maximal set of shortest augmenting paths for \(M\), \(P\) is not a shortest augmenting path for \(M\), that is, \(k' > k\).
So assume that \(P\) shares a vertex \(v\) with at least one path \(P_i\) in \(A\). Let \(S = (P_1 \cup \cdots \cup P_t) \oplus P\). Since \(M' = M \oplus (P_1 \cup \cdots \cup P_t)\), we have \(P_1 \cup \cdots \cup P_t = M \oplus M'\), that is, \(S = (M \oplus M') \oplus P = M \oplus (M' \oplus P)\). \(M' \oplus P\) is a matching of size \(|M'| + 1 = |M| + t + 1\) because \(P\) is an augmenting path for \(M'\). Thus, by Lemma 7.8, \(S\) is a collection of paths and cycles that includes at least \(|M' \oplus P| - |M| = t + 1\) augmenting paths for \(M\). Since any augmenting path for \(M\) has length at least \(k\), the set \(S\) thus contains at least \(k(t+1)\) edges. Since every vertex in \(P_i\) has an incident edge in \(M'\) and the endpoints of \(P\) are unmatched by \(M'\), \(v\) must be an internal vertex of \(P\). Its incident edge in \(M'\) belongs to both \(P_i\) and \(P\) because \(v\) has an incident edge in \(M'\) that belongs to \(P\) and an incident in \(M'\) that belongs to \(P_i\) but \(M'\) is a matching and thus contains at most one edge incident to \(v\). Thus, \(P\) and \(P_i\) share at least one edge. This implies that \(|S| \le |P_1 \cup \cdots \cup P_t \cup P| \le kt + k' - 1\). Since \(|S| \ge k(t+1)\), this gives \(kt + k' - 1 \ge k(t+1)\), that is, \(k' \ge k + 1\). ▨
How's that for an obvious definition?

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.