10.1. Metric \(\boldsymbol{k}\)-Center

\(\boldsymbol{k}\)-Center Problem: Given a complete graph \(G = (V, E)\) and a weight function \(w : E \rightarrow \mathbb{R}\), choose a subset \(C \subseteq V\) of \(k\) centers such that the maximum distance from any vertex in \(G\) to its closest center is minimized. In other words, minimize the expression

\[\max_{x \in V \setminus C}\min_{y \in C}w_{x,y}.\tag{10.1}\]

Figure 10.1 provides an illustration.


Figure 10.1: Coming soon!


You can think of this as an abstract formulation of the facility location problem where we want to place \(k\) stores throughout the city so that the distance any resident has to travel to the closest store is minimized.

Exercise 10.1: Prove that if we do not impose any restrictions on the edge weights, then there exists no polynomial-time \(\alpha(n)\)-approximation algorithm for the \(k\)-center problem, for any polynomial-time computable function \(\alpha\), unless \(\textrm{P} = \textrm{NP}\). (Hint: Use ideas from the proof of Theorem 10.4 in Section 10.2.)

In order to obtain a version of the problem that can be approximated, we need to impose restrictions on the edge weights. A natural restriction certainly satisfied by the geometric interpretation of the problem as a facility location problem is that the edges should satisfy the triangle inequality:

\[w_{x,y} + w_{y,z} \ge w_{x,z} \quad \forall x, y, z \in V.\]

This turns the weight function into a metric. Thus, this is called the metric \(\boldsymbol{k}\)-center problem.

A word of warning before we begin: The following \(2\)-approximation algorithm for the metric \(k\)-center problem employs a series of reductions. It expresses the metric \(k\)-center problem as the problem of finding a subgraph of \(G\) that has a small dominating set. Then we show that \(G\) does not have a small dominating set if the square of \(G\), which we will define shortly, has a large independent set, and then we look for a large independent set in \(G^2\).

This sequence of reductions does not seem to make any progress towards obtaining a problem we can actually solve in polynomial time: the \(k\)-center problem is NP-hard, as is the dominating set problem, as is the maximum independent set problem. The reason why we nevertheless obtain an approximation of an optimal set of centers is that we do not solve the maximum independent set problem exactly; we only make the best effort we can make in polynomial time to find a large independent set. If we succeed, then we know that the current subgraph of \(G\) has no small dominating set, so it is the wrong subgraph. If we fail to find a large independent set, then such an independent set may still exist. However, we can prove that the independent set we found, which is too small, is in fact a reasonable choice as a set of centers. So, until we finally go and look for an independent set in \(G^2\), simply ignore that we are reducing NP-hard problems to other NP-hard problems. In the end, we will obtain a reduction to a problem that we can solve in polynomial time.

Let us start by sorting the edges in \(G\) by increasing weights:

\[w_{e_1} \le \cdots \le w_{e_m}.\]

For \(1 \le j \le m\), let \(G_j = (V, E_j)\), where \(E_j = \{e_1, \ldots, e_j\}\). In other words, \(G_j\) is the subgraph of \(G\) containing all vertices of \(G\) but only the \(j\) cheapest edges of \(G\).
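The subgraphs \(G_j\) are easy to enumerate once the edges are sorted. The following Python sketch (an illustration, not part of the text) assumes the graph is given as a dictionary mapping each edge, represented as a `frozenset` of its two endpoints, to its weight:

```python
# Illustrative sketch: enumerate the edge sets E_1 ⊆ ... ⊆ E_m of the
# subgraphs G_j. Assumes a dict mapping frozenset({x, y}) -> weight.

def sorted_edges(weights):
    """The edges e_1, ..., e_m in order of non-decreasing weight."""
    return sorted(weights, key=weights.get)

def edge_sets(weights):
    """Yield (j, E_j), where E_j consists of the j cheapest edges."""
    E = set()
    for j, e in enumerate(sorted_edges(weights), start=1):
        E.add(e)
        yield j, set(E)
```

Since each \(E_j\) extends \(E_{j-1}\) by the single edge \(e_j\), in practice one maintains a single growing edge set instead of copying it.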

Since \(\textrm{OPT} = \max_{x \in V \setminus C}\min_{y \in C} w_{x,y}\), where \(C\) is an optimal set of centers, \(\textrm{OPT}\) is the weight of some edge in \(G\), that is, \(\textrm{OPT} = w_{e_j}\), for some index \(j\). To compute \(\textrm{OPT}\), we thus have to find this index \(j\). To approximate \(\textrm{OPT}\), we need to approximate \(j\).
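On small instances, this observation can be verified directly by brute force. The following sketch (an illustration only, exponential in \(k\)) assumes a symmetric \(n \times n\) weight matrix `w` and \(k < n\):

```python
# Illustrative sketch: compute OPT exactly by trying every size-k subset of
# centers. Exponential in k; only usable on tiny instances.

from itertools import combinations

def brute_force_opt(w, k):
    n = len(w)
    return min(
        max(min(w[x][y] for y in C) for x in range(n) if x not in C)
        for C in combinations(range(n), k)
    )
```

Whatever the input, the returned value is always an entry of `w`, i.e., the weight of some edge \(e_j\).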

The following lemma shows that \(j\) is the minimum index such that \(G_j\) has a dominating set of size \(k\).

Given a graph \(G = (V, E)\), a dominating set is a subset \(D \subseteq V\) of vertices such that every vertex \(x \in V \setminus D\) has a neighbour in \(D\). This is illustrated in Figure 10.2.


Figure 10.2: Coming soon!


Dominating Set Problem: Given an undirected graph \(G = (V, E)\), find a dominating set of minimum size in \(G\).
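Although finding a minimum dominating set is hard, verifying that a given set is dominating is straightforward. A minimal sketch, assuming the graph is given as an adjacency dictionary mapping each vertex to the set of its neighbours:

```python
# Illustrative sketch: check the dominating-set property for a candidate D.
# adj maps each vertex to the set of its neighbours.

def is_dominating_set(adj, D):
    """True iff every vertex is in D or has a neighbour in D."""
    return all(x in D or adj[x] & D for x in adj)
```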

Lemma 10.1: \(\mathrm{OPT} = w_{e_{j^*}}\), where \(j^*\) is the minimum index such that \(G_{j^*}\) has a dominating set of size \(k\).

Proof: Let \(D = \{y_1, \ldots, y_k\}\) be a dominating set of \(G_{j^*}\). Since every vertex \(x \in V\) is in \(D\) or has a neighbour in \(D\) and every edge in \(G_{j^*}\) has weight at most \(w_{e_{j^*}}\), we have

\[\min_{y \in D} w_{x,y} \le w_{e_{j^*}}\]

for every vertex \(x \in V \setminus D\). Thus, since \(|D| = k\),

\[\textrm{OPT} \le \max_{x \in V \setminus D} \min_{y \in D} w_{x,y} \le w_{e_{j^*}}.\]

Conversely, consider any set \(C = \{y_1, \ldots, y_k\}\) of \(k\) centers. For \(j = j^* - 1\), \(C\) is not a dominating set of \(G_j\) because \(|C| = k\) and \(G_j\) does not have a dominating set of size \(k\), by the choice of \(j^*\). Thus, there exists a vertex \(x \in V \setminus C\) such that the edge \((x,y)\) connecting it to its closest neighbour \(y \in C\) is not in \(G_j\). This edge therefore has weight \(w_{x,y} \ge w_{e_{j+1}} = w_{e_{j^*}}\), that is,

\[\max_{x \in V \setminus C} \min_{y \in C} w_{x,y} \ge w_{e_{j^*}}.\]

Since this is true for any choice of \(k\) centers, we have \(\mathrm{OPT} \ge w_{e_{j^*}}\). ▨

By Lemma 10.1, we can solve the \(k\)-center problem by finding the smallest index \(j^*\) such that \(G_{j^*}\) has a dominating set of size \(k\). Since the dominating set problem is NP-hard, we should not hope to find this index \(j^*\) in polynomial time. However, the following lemma provides a necessary condition for the existence of a small dominating set.

Given a graph \(G = (V, E)\), the square of \(G\) is the graph \(G^2 = (V, E^2)\), where \(E^2 = E \cup \{(x, y) \mid x \ne y \text{ and } N[x] \cap N[y] \ne \emptyset\}\). In other words, two distinct vertices \(x\) and \(y\) are joined by an edge in \(G^2\) if and only if there exists a path of length at most \(2\) from \(x\) to \(y\) in \(G\). This is illustrated in Figure 10.3.


Figure 10.3: Coming soon!


\(G^2\) is the "square" of \(G\) in the sense that the adjacency matrix of \(G^2\) is \(M \vee M^2\), where \(M\) is the adjacency matrix of \(G\).
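In code, we can form \(G^2\) directly from adjacency sets rather than multiplying matrices; the effect is the same as computing \(M \vee M^2\) over booleans. A sketch, assuming `adj` maps each vertex to the set of its neighbours (the open neighbourhood \(N(x)\)):

```python
# Illustrative sketch: square a graph by joining each vertex to everything
# reachable by a path of at most two edges.

def square(adj):
    """Return the adjacency dict of G^2."""
    adj2 = {}
    for x in adj:
        reach = set(adj[x])           # neighbours of x ...
        for y in adj[x]:
            reach |= adj[y]           # ... plus neighbours of neighbours
        reach.discard(x)              # no self-loops
        adj2[x] = reach
    return adj2
```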

Lemma 10.2: \(G\) has a dominating set of size \(k\) only if \(G^2\) does not have an independent set of size \(k+1\).

Proof: Consider a dominating set \(D = \{y_1, \ldots, y_k\}\) of \(G\) of size \(k\). Note that \(N[y_i]\) induces a clique (a complete subgraph) in \(G^2\) for each vertex \(y_i \in D\) because for every pair of vertices \(x, z \in N[y_i]\), \(G\) contains a path of length at most two from \(x\) to \(z\) via \(y_i\). An independent set of \(G^2\) contains at most one vertex from each clique in \(G^2\). Thus, any independent set \(I\) of \(G^2\) contains at most one vertex from each neighbourhood \(N[y_i]\). Since \(D\) is a dominating set, \(V = N[y_1] \cup \cdots \cup N[y_k]\), so \(I\) contains no vertices not in any neighbourhood \(N[y_i]\). This shows that \(|I| \le k\). ▨

Based on Lemma 10.2, we can formulate the following approximation algorithm for the metric \(k\)-center problem:

  • For \(j = 1, \ldots, m\):

    • Compute \(G_j\) and then \(G_j^2\) and find a maximal independent set \(I\) of \(G_j^2\). By Exercise 7.4, we can find such a set in \(O\bigl(n^2\bigr)\) time.

    • If \(|I| > k\), then \(G_j\) has no dominating set of size \(k\), by Lemma 10.2. Thus, by Lemma 10.1, \(j < j^*\).

    • If \(|I| \le k\), then we cannot guarantee that \(G_j\) has no dominating set of size \(k\), so we also cannot guarantee that \(j < j^*\). In this case, we return \(I\) as a set of centers. Clearly, \(I\) has the right size. By the next lemma, \(I\) is a \(2\)-approximation of an optimal set of \(k\) centers.
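Putting the pieces together, the whole algorithm can be sketched as follows. This is an illustrative implementation, not the text's pseudocode: it assumes vertices \(0, \ldots, n-1\), a symmetric weight matrix `w` satisfying the triangle inequality, and \(1 \le k < n\), and it finds a maximal independent set by a simple greedy scan.

```python
# Illustrative sketch of the 2-approximation algorithm for metric k-center.

def k_center(w, k):
    n = len(w)
    # Sort the edges by non-decreasing weight: e_1, ..., e_m.
    edges = sorted((w[x][y], x, y) for x in range(n) for y in range(x + 1, n))
    adj = {x: set() for x in range(n)}  # adjacency sets of the current G_j
    for _, x, y in edges:               # iteration j adds the edge e_j
        adj[x].add(y)
        adj[y].add(x)
        # Square G_j: join vertices connected by a path of at most two edges.
        adj2 = {v: set().union(adj[v], *(adj[u] for u in adj[v])) - {v}
                for v in range(n)}
        # Greedily grow a maximal independent set I of G_j^2.
        I, blocked = [], set()
        for v in range(n):
            if v not in blocked:
                I.append(v)
                blocked |= adj2[v] | {v}
        # If I is small, Lemma 10.2 cannot rule out a size-k dominating set;
        # by Lemma 10.3, I is then a 2-approximate set of centers.
        if len(I) <= k:
            return I
```

Since \(G_m = G\) is complete, \(G_m^2\) is a clique and the final iteration always finds a maximal independent set of size \(1 \le k\), so the loop terminates with a solution. Rebuilding \(G_j^2\) from scratch in every iteration, as above, is wasteful; it keeps the sketch short but is not required for polynomial running time.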

Lemma 10.3: \(\max_{x \in V \setminus I} \min_{y \in I} w_{x,y} \le 2 \cdot \mathrm{OPT}\).

Proof: Let \(j\) be the index of the iteration in which we return \(I\). Then \(j \le j^*\) because every index \(j' < j\) satisfies \(j' < j^*\). Thus, we have

\[w_{e_j} \le w_{e_{j^*}}.\]

Since \(w_{e_{j^*}} = \mathrm{OPT}\), it suffices to prove that

\[\max_{x \in V \setminus I} \min_{y \in I} w_{x,y} \le 2w_{e_j}.\]

Consider any vertex \(x \notin I\). Since \(I\) is a maximal independent set of \(G_j^2\), \(x\) has a neighbour \(y\) in \(G_j^2\) that belongs to \(I\). See Figure 10.4. Since \(y\) is a neighbour of \(x\) in \(G_j^2\), there exists a path from \(x\) to \(y\) in \(G_j\) comprised of at most two edges. Each of these edges has weight at most \(w_{e_j}\). Thus, by the triangle inequality, the edge \((x,y)\) has weight at most \(2w_{e_j}\). Since this is true for every vertex \(x \in V \setminus I\), we have \(\max_{x \in V \setminus I} \min_{y \in I} w_{x,y} \le 2w_{e_j}\), as claimed. ▨


Figure 10.4: Coming soon!


A family of tight inputs is provided by the family of complete graphs with \(n \ge 3\) vertices. Let us denote the vertices of such a graph \(G\) as \(x_0, \ldots, x_{n-1}\). Each edge \((x_0, x_i)\) of \(G\), for \(1 \le i \le n-1\), has weight \(1\). Any other edge has weight \(2\). See Figure 10.5.


Figure 10.5: Coming soon!


The parameter \(k\) is \(1\). The optimal choice of a single center is \(x_0\) because the edge \((x_0,x_i)\) has weight \(1\) for all \(1 \le i \le n-1\). For \(1 \le j < n-1\), \(G_j\) is disconnected, and so is \(G_j^2\). In particular, any maximal independent set of \(G_j^2\) has size at least two, which proves that \(j < j^*\). For \(j = n - 1\), the graph \(G_j\) is a star, and \(G_j^2\) is a clique. Thus, any independent set of \(G_j^2\) has size \(1\). Therefore, the above algorithm returns an independent set \(I\) of \(G_j^2\) in the \((n-1)\)st iteration. Since \(G_j^2\) is a clique, any one-vertex set \(I\) is a maximal independent set of \(G_j^2\). If the algorithm chooses a vertex \(x_i\) other than \(x_0\) as the vertex in this independent set, then the objective function value of this solution to the \(k\)-center problem is \(2\) because every edge \((x_i,x_j)\) with \(i \ne 0\) and \(j \ne 0\) has weight \(2\).
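The arithmetic behind this tight example is easy to check directly. A small sketch, encoding the weights described above:

```python
# Illustrative sketch: the tight instance. Vertices x_0, ..., x_{n-1}; each
# edge (x_0, x_i) has weight 1, every other edge has weight 2, and k = 1.

def weight(i, j):
    return 1 if 0 in (i, j) else 2

def objective(n, center):
    """Objective value of choosing the single center x_center."""
    return max(weight(x, center) for x in range(n) if x != center)
```

Choosing \(x_0\) gives objective value \(1\), while any other single center gives \(2\), matching the approximation factor.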

Exercise 10.2: We only provided a tight example for the metric \(k\)-center problem for the case when \(k = 1\). This leaves open the possibility that the algorithm achieves an approximation factor of \(1 + \frac{1}{k}\) or any other decreasing function that happens to be \(2\) for \(k = 1\). Prove that this is not the case by generalizing the tight example to arbitrary values of \(k\).


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.