A $(2+\varepsilon)$-Approximation Algorithm for Metric $k$-Median

Vincent Cohen-Addad, Fabrizio Grandoni, Euiwoong Lee, Chris Schwiegelshohn, Ola Svensson

TL;DR

This work presents a $(2+\varepsilon)$-approximation algorithm for $k$-median, improving the previous best-known approximation factor of $2.613$, and develops a novel $(2+\varepsilon)$-approximation algorithm tailored for stable instances, where removing any center from an optimal solution increases the cost by at least an $\Omega(\varepsilon^3/\log n)$ fraction.

Abstract

In the classical NP-hard metric $k$-median problem, we are given a set of $n$ clients and centers with metric distances between them, along with an integer parameter $k\geq 1$. The objective is to select a subset of $k$ open centers that minimizes the total distance from each client to its closest open center. In their seminal work, Jain, Mahdian, Markakis, Saberi, and Vazirani presented the Greedy algorithm for facility location, which implies a $2$-approximation algorithm for $k$-median that opens $k$ centers in expectation. Since then, substantial research has aimed at narrowing the gap between their algorithm and the best achievable approximation by an algorithm guaranteed to open exactly $k$ centers. During the last decade, all improvements have been achieved by leveraging their algorithm or a small improvement thereof, followed by a second step called bi-point rounding, which inherently increases the approximation guarantee. Our main result closes this gap: for any $\varepsilon>0$, we present a $(2+\varepsilon)$-approximation algorithm for $k$-median, improving the previous best-known approximation factor of $2.613$. Our approach builds on a combination of two algorithms. First, we present a non-trivial modification of the Greedy algorithm that operates with $O(\log n/\varepsilon^2)$ adaptive phases. Through a novel walk-between-solutions approach, this enables us to construct a $(2+\varepsilon)$-approximation algorithm for $k$-median that consistently opens at most $k + O(\log n/\varepsilon^2)$ centers. Second, we develop a novel $(2+\varepsilon)$-approximation algorithm tailored for stable instances, where removing any center from an optimal solution increases the cost by at least an $\Omega(\varepsilon^3/\log n)$ fraction. Achieving this involves a sampling approach inspired by the $k$-means++ algorithm and a reduction to submodular optimization subject to a partition matroid.
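
To make the objective and the stability notion from the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm: it evaluates the $k$-median cost, finds an optimum by exhaustive search (feasible only on tiny instances), and measures how much the cost grows when a single center is closed. The function names, the toy instance, and the choice to measure the increase relative to the solution's own cost are assumptions made for illustration.

```python
from itertools import combinations


def kmedian_cost(dist, centers):
    """Total distance from each client to its closest open center.

    dist[j][i] is the distance from client j to candidate center i.
    """
    return sum(min(row[i] for i in centers) for row in dist)


def brute_force_kmedian(dist, k):
    """Try every set of exactly k centers; exponential, for tiny instances only."""
    n_centers = len(dist[0])
    best = min(combinations(range(n_centers), k),
               key=lambda subset: kmedian_cost(dist, subset))
    return best, kmedian_cost(dist, best)


def stability_ratio(dist, centers):
    """Smallest relative cost increase caused by closing one center.

    Assumes at least two open centers and a strictly positive base cost.
    """
    base = kmedian_cost(dist, centers)
    return min(
        (kmedian_cost(dist, [c for c in centers if c != removed]) - base) / base
        for removed in centers
    )


# Hypothetical toy instance: six points on a line serve as both clients
# and candidate centers; absolute difference is a metric.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]

centers, cost = brute_force_kmedian(dist, 2)
print(centers, cost)                   # (1, 4) 4
print(stability_ratio(dist, centers))  # 7.0
```

On this instance the two clusters are far apart, so closing either optimal center raises the cost from 4 to 32. Under the reading assumed here, such an instance is very stable; the abstract's threshold of an $\Omega(\varepsilon^3/\log n)$ fraction is far weaker than this.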

Paper Structure

This paper contains 97 sections, 34 theorems, 170 equations, 1 figure, 1 table, 3 algorithms.

Key Result

Theorem 1.1

For every $\varepsilon>0$, there is a randomized polynomial-time algorithm for $k$-median that returns a solution with cost at most $(2+\varepsilon)\text{opt}$ with high probability.

Figures (1)

  • Figure 1: How we walk from $H'_p$ to $H_p$. Dark grey color denotes a free facility and light grey color denotes regular open facilities.

Theorems & Definitions (81)

  • Theorem 1.1
  • Theorem 1.2
  • Theorem 1.3
  • Lemma 1.4
  • proof
  • Definition 1
  • Theorem 3.1
  • Lemma 3.1
  • Lemma 3.2
  • Lemma 3.3
  • ...and 71 more