A Broader View on Clustering under Cluster-Aware Norm Objectives
Martin G. Herold, Evangelos Kipouridis, Joachim Spoerhase
TL;DR
The paper studies clustering under cluster-aware norm objectives (f,g), unifying and extending classic problems such as k-Center, k-Median, Min-Sum of Radii, and Min-Load k-Clustering. It introduces a layered-ball reduction and a proxy-cost framework that enable primal-dual and LP-based approaches, including Lagrange-multiplier-preserving techniques and bi-point rounding. These yield an O(log^2 n)-approximation for (f, L_1)-Clustering and an O(k)-approximation for general (f,g)-Clustering. A key novelty is the attenuation parameter χ for symmetric norms, which allows fine-grained interpolation between extreme objectives and yields an overarching bound of O(min(k^{1-χ_g} log^2 n, n^{χ_f} log k, k)) for (f,g)-Clustering. Together, these results connect and interpolate among the base objectives, providing near-optimal guarantees in broad regimes and offering a roadmap for further exploration of general norm-based clustering objectives.
Abstract
We revisit the $(f,g)$-clustering problem that we introduced in a recent work [SODA'25], and which subsumes fundamental clustering problems such as $k$-Center, $k$-Median, Min-Sum of Radii, and Min-Load $k$-Clustering. This problem assigns each of the $k$ clusters a cost determined by the monotone, symmetric norm $f$ applied to the vector of distances in the cluster, and aims at minimizing the norm $g$ applied to the vector of cluster costs. Previously, we focused on certain special cases for which we designed constant-factor approximation algorithms. However, our bounds for more general settings left large gaps to the known bounds for the basic problems they capture. In this work, we provide a clearer picture of the approximability of these more general settings. First, we design an $O(\log^2 n)$-approximation algorithm for $(f, L_{1})$-clustering for any $f$. This improves upon our previous $\widetilde{O}(\sqrt{n})$-approximation. Second, we provide an $O(k)$-approximation for the general $(f,g)$-clustering problem, which improves upon our previous $\widetilde{O}(\sqrt{kn})$-approximation algorithm and matches the best-known upper bound for Min-Load $k$-Clustering. We then design an approximation algorithm for $(f,g)$-clustering that interpolates, up to polylog factors, between the best-known bounds for $k$-Center, $k$-Median, Min-Sum of Radii, Min-Load $k$-Clustering, (Top, $L_{1}$)-clustering, and $(L_{\infty},g)$-clustering based on a newly defined parameter of $f$ and $g$.
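To make the objective concrete, the following minimal Python sketch evaluates the $(f,g)$ cost of a fixed clustering: $f$ is applied to the vector of point-to-center distances within each cluster, and $g$ is applied to the resulting vector of cluster costs. The function names (`fg_cost`, `l1`, `linf`), the Euclidean metric, and the toy instance are assumptions made for illustration only; the sketch evaluates a given solution and is not one of the approximation algorithms described above.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]
Norm = Callable[[Sequence[float]], float]


def l1(v: Sequence[float]) -> float:
    """L_1 norm: sum of absolute values."""
    return float(sum(abs(x) for x in v))


def linf(v: Sequence[float]) -> float:
    """L_infinity norm: largest absolute value (0 for an empty vector)."""
    return float(max((abs(x) for x in v), default=0.0))


def fg_cost(points: Sequence[Point], centers: Sequence[Point],
            assignment: Sequence[int], f: Norm, g: Norm) -> float:
    """Evaluate the (f, g) objective of a given clustering.

    assignment[i] is the index of the center serving points[i].
    """
    # Collect one distance vector per cluster (hypothetical Euclidean metric).
    per_cluster = [[] for _ in centers]
    for p, c in zip(points, assignment):
        per_cluster[c].append(math.dist(p, centers[c]))
    # Inner norm f gives each cluster its cost; outer norm g aggregates them.
    cluster_costs = [f(d) for d in per_cluster]
    return g(cluster_costs)


# Toy instance on a line: two clusters with distance vectors (1, 0, 1) and (0, 3).
points = [(0.0,), (1.0,), (2.0,), (10.0,), (13.0,)]
centers = [(1.0,), (10.0,)]
assignment = [0, 0, 0, 1, 1]

print(fg_cost(points, centers, assignment, linf, linf))  # k-Center objective: 3.0
print(fg_cost(points, centers, assignment, l1, l1))      # k-Median objective: 5.0
print(fg_cost(points, centers, assignment, linf, l1))    # Min-Sum of Radii objective: 4.0
print(fg_cost(points, centers, assignment, l1, linf))    # Min-Load objective: 3.0
```

Swapping the two norm arguments recovers the four basic problems named in the abstract, which is the sense in which $(f,g)$-clustering subsumes them; any other monotone, symmetric norm can be passed in the same way.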
