The Computational Complexity of Almost Stable Clustering with Penalties
Kamyar Khodamoradi, Farnam Mansouri, Sandra Zilles
TL;DR
The paper maps the complexity landscape of stable clustering by analyzing generalized stability notions for $k$-Means and $k$-Median, including their penalized variants, in metrics of bounded doubling dimension. On the positive side, it proves that $(1+\varepsilon')$-stable instances in doubling metrics, with and without penalties, are solvable in polynomial time using an enhanced $\rho$-swap local search combined with penalty augmentation. On the negative side, it establishes ETH-based super-polynomial lower bounds for almost-stable $(\alpha,\beta)$-stable instances in Euclidean and doubling spaces. The hardness results rely on reductions from Grid Tiling Inequality and Partial Vertex Cover via moment-curve constructions, and they separate the regime of strong stability, where exact solutions are efficiently computable, from near-stable regimes, where, under ETH, they are not. Together, these results refine the understanding of when structure in the input enables efficient clustering and expose the limits of stability-based approaches; the paper closes with an open question about the existence of $(1+\varepsilon)$-approximations for almost-stable instances.
Abstract
We investigate the complexity of stable (or perturbation-resilient) instances of $k$-Means and $k$-Median clustering problems in metrics with small doubling dimension. While these problems have been extensively studied under multiplicative perturbation resilience in low-dimensional Euclidean spaces (e.g., Friggstad et al., 2019; Cohen-Addad and Schwiegelshohn, 2017), we adopt a more general notion of stability, termed "almost stable", which is closer to the notion of $(\alpha, \varepsilon)$-perturbation resilience introduced by Balcan and Liang (2016). Additionally, we extend our results to $k$-Means/$k$-Median with penalties, where each data point is either assigned to a cluster centre or incurs a penalty. We show that certain special cases of almost stable $k$-Means/$k$-Median (with penalties) are solvable in polynomial time. To complement this, we also examine the hardness of almost stable instances and $(1 + \frac{1}{\mathrm{poly}(n)})$-stable instances of $k$-Means/$k$-Median (with penalties), proving super-polynomial lower bounds on the runtime of any exact algorithm under the widely believed Exponential Time Hypothesis (ETH).
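To make the penalized objective concrete, the sketch below evaluates the cost of a fixed set of centres when each point either connects to its nearest centre or pays its penalty, whichever is cheaper. It follows a common formulation of clustering with penalties and is not taken from the paper: the function name `penalized_cost`, the `power` parameter, the toy data, and the choice of Euclidean distance are all assumptions made for illustration, and the paper's exact definition may differ.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def penalized_cost(points, centres, penalties, power=1):
    """Cost of a centre set when each point may pay its penalty instead.

    Hypothetical helper: power=1 gives a k-Median-style objective,
    power=2 a k-Means-style one.
    """
    total = 0.0
    for point, penalty in zip(points, penalties):
        # Distance from the point to its nearest centre, raised to `power`.
        connection = min(dist(point, c) for c in centres) ** power
        # The point is either served by a centre or pays its penalty,
        # whichever costs less.
        total += min(connection, penalty)
    return total


# Toy instance (assumed data): one centre at the origin, uniform penalty 5.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
centres = [(0.0, 0.0)]
penalties = [5.0, 5.0, 5.0]

cost_median = penalized_cost(points, centres, penalties, power=1)  # 0 + 2 + 5
cost_means = penalized_cost(points, centres, penalties, power=2)   # 0 + 4 + 5
```

In this toy instance the far point at distance 10 pays its penalty of 5 under both objectives rather than being connected, which is the behaviour that distinguishes the penalized variants from the standard problems.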
