Clustering with Few Disks to Minimize the Sum of Radii
Mikkel Abrahamsen, Sarita de Berg, Lucas Meijer, André Nusser, Leonidas Theocharous
TL;DR
This work studies the $k$-MinSumRadius clustering problem in the plane and in higher constant dimensions, focusing on small $k$, where fast exact algorithms are attainable. The authors uncover a surprisingly simple separator structure: one can specify a linear number of candidate lines such that, in an optimal solution, one of them separates one cluster from the remaining clusters. Each candidate can then be evaluated efficiently by dynamically maintaining minimum enclosing disks (or balls in higher dimensions). This yields a near-linear algorithm for $2$-MinSumRadius in the plane with $O(n \log^2 n \log^2 \log n)$ expected time, an extension to any constant dimension $d$ with $O\left(n^{2-1/(\lceil d/2\rceil+1)+\varepsilon}\right)$ time, and a near-quadratic algorithm for $3$-MinSumRadius in the plane with $O(n^2 \log^2 n \log^2 \log n)$ expected time. The core idea, restricting the search to a linear number of separators and maintaining minimum enclosing disks dynamically, offers a path toward more scalable exact clustering in geometric settings and highlights separator-based techniques for related problems.
Abstract
Given a set of $n$ points in the Euclidean plane, the $k$-MinSumRadius problem asks to cover this point set using $k$ disks with the objective of minimizing the sum of the radii of the disks. After a long line of research on related problems, it was finally discovered that this problem admits a polynomial-time algorithm [GKKPV '12]; however, the running time of this algorithm is $O(n^{881})$, so its relevance is mostly theoretical. A practically and structurally interesting special case of the $k$-MinSumRadius problem is that of small $k$. For the $2$-MinSumRadius problem, a near-quadratic algorithm with expected running time $O(n^2 \log^2 n \log^2 \log n)$ was given over 30 years ago [Eppstein '92]. We present the first improvement of this result, namely, a near-linear algorithm for $2$-MinSumRadius that runs in expected $O(n \log^2 n \log^2 \log n)$ time. We generalize this result to any constant dimension $d$, for which we give an $O(n^{2-1/(\lceil d/2\rceil + 1) + \varepsilon})$ time algorithm. Additionally, we give a near-quadratic algorithm for $3$-MinSumRadius in the plane that runs in expected $O(n^2 \log^2 n \log^2 \log n)$ time. All of these algorithms rely on insights that uncover a surprisingly simple structure of optimal solutions: we can specify a linear number of lines, one of which separates one of the clusters from the remaining clusters in an optimal solution.
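
To make the objective concrete, the following is a minimal brute-force sketch of $2$-MinSumRadius for tiny inputs. It is not the algorithm of this paper: it tries every split of the points into two clusters and computes each minimum enclosing disk by checking all disks defined by two or three points, so it takes exponential time. The function names (`circumcircle`, `min_enclosing_radius`, `two_min_sum_radius`) and the tolerance `eps` are illustrative assumptions, and an empty cluster is assumed to cost radius $0$.

```python
from itertools import combinations
from math import dist, inf


def circumcircle(a, b, c):
    """Return (center, radius) of the circle through a, b, c, or None if collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy), dist((ux, uy), a)


def min_enclosing_radius(points, eps=1e-9):
    """Radius of the minimum enclosing disk, by brute force over candidate disks."""
    if len(points) <= 1:
        return 0.0  # empty or single-point clusters cost nothing (assumption)
    candidates = []
    # Disks having two input points as a diameter.
    for a, b in combinations(points, 2):
        center = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)
        candidates.append((center, dist(a, b) / 2.0))
    # Disks whose boundary passes through three input points.
    for a, b, c in combinations(points, 3):
        circle = circumcircle(a, b, c)
        if circle is not None:
            candidates.append(circle)
    best = inf
    for center, radius in candidates:
        if radius < best and all(dist(center, p) <= radius + eps for p in points):
            best = radius
    return best


def two_min_sum_radius(points):
    """Minimum sum of radii of two disks covering all points (exponential time)."""
    if not points:
        return 0.0
    n = len(points)
    best = inf
    # Point 0 always goes to the first cluster, which avoids mirrored splits.
    for mask in range(1 << (n - 1)):
        first = [points[0]]
        second = []
        for i in range(1, n):
            (second if (mask >> (i - 1)) & 1 else first).append(points[i])
        best = min(best, min_enclosing_radius(first) + min_enclosing_radius(second))
    return best


if __name__ == "__main__":
    print(two_min_sum_radius([(0, 0), (2, 0), (10, 0), (12, 0)]))
```

For the four collinear points in the usage example, the best split is $\{(0,0),(2,0)\}$ and $\{(10,0),(12,0)\}$ with radii $1 + 1 = 2$, whereas a single disk would need radius $6$. For the corners of a unit square, by contrast, one disk of radius $\sqrt{2}/2$ beats any split into two nonempty clusters, so the second disk degenerates to radius $0$.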
