Inapproximability of Maximum Diameter Clustering for Few Clusters
Henry Fleischmann, Kyrylo Karlov, Karthik C. S., Ashwin Padaki, Stepan Zharkov
TL;DR
This work investigates the Max-$k$-Diameter clustering problem, proving strong hardness of approximation results for fixed $k$ in key metric spaces. By introducing cloud systems, a flexible embedding framework from $k$-uniform hypergraphs to metric pointsets, the authors connect coloring properties to clustering quality, enabling robust NP-hardness reductions. They construct a $3/2$-cloud system in $\ell_1$ (and Hadamard-based variants) and a $1.304$-cloud system in $\ell_2$, establishing NP-hardness of approximation within factors of $3/2$ and $1.304$, respectively; these results extend to all $\ell_p$ via embeddings and dimension-reduction arguments. The paper also delineates fundamental barriers to improving these hardness factors and discusses open problems, including tightening constants and extending techniques to related geometric optimization problems. Overall, the cloud-system framework advances our understanding of Euclidean and $\ell_1$ hardness in fixed-$k$ clustering objectives and suggests new directions for geometric approximation barriers and reductions.
Abstract
In the Max-k-diameter problem, we are given a set of points in a metric space, and the goal is to partition the input points into k parts such that the maximum pairwise distance between points in the same part of the partition is minimized. The approximability of the Max-k-diameter problem was studied in the eighties, culminating in the work of Feder and Greene [STOC'88], wherein they showed it is NP-hard to approximate within a factor better than 2 in the $\ell_1$ and $\ell_\infty$ metrics, and NP-hard to approximate within a factor better than 1.969 in the Euclidean metric. This complements the celebrated 2-factor polynomial-time approximation algorithm for the problem in general metrics (Gonzalez [TCS'85]; Hochbaum and Shmoys [JACM'86]). Over the last couple of decades, there has been increased interest from the algorithmic community in studying the approximability of various clustering objectives when the number of clusters is fixed. In this setting, the framework of coresets has yielded PTAS for most popular clustering objectives, including k-means, k-median, k-center, k-minsum, and so on. In this paper, rather surprisingly, we prove that even when k=3, the Max-k-diameter problem is NP-hard to approximate within a factor of 1.5 in the $\ell_1$-metric (and Hamming metric) and NP-hard to approximate within a factor of 1.304 in the Euclidean metric. Our main conceptual contribution is the introduction of a novel framework called cloud systems, which embed hypergraphs into $\ell_p$-metric spaces such that the chromatic number of the hypergraph is related to the quality of the Max-k-diameter clustering of the embedded pointset. Our main technical contributions are the constructions of nontrivial cloud systems in the Euclidean and $\ell_1$-metrics using extremal geometric structures.
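To make the objective concrete, the following is a minimal illustrative sketch of the Max-k-diameter problem as defined above, not code from the paper. The helper names (`l1`, `max_diameter`, `brute_force_max_k_diameter`) and the toy pointset are assumptions introduced purely for illustration. The brute-force search tries all k^n labelings and is only meant for tiny inputs.

```python
from itertools import product


def l1(p, q):
    """l_1 (Manhattan) distance between two points given as tuples."""
    return sum(abs(a - b) for a, b in zip(p, q))


def max_diameter(points, labels, dist=l1):
    """Largest distance between two points that share a cluster label."""
    worst = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if labels[i] == labels[j]:
                worst = max(worst, dist(points[i], points[j]))
    return worst


def brute_force_max_k_diameter(points, k, dist=l1):
    """Exhaustive search over all k^n labelings (hypothetical helper, tiny inputs only)."""
    best_cost, best_labels = None, None
    for labels in product(range(k), repeat=len(points)):
        cost = max_diameter(points, labels, dist)
        if best_cost is None or cost < best_cost:
            best_cost, best_labels = cost, labels
    return best_cost, best_labels


# Toy instance (an assumption for illustration): three well-separated groups.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (10, 0)]
cost, labels = brute_force_max_k_diameter(points, k=3)
print(cost, labels)
```

On this toy instance the optimum keeps each well-separated group in its own part, so the cost is the largest within-group distance. The exhaustive search takes exponential time; the hardness results above concern what polynomial-time algorithms can guarantee, even for k=3.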
