A Broader View on Clustering under Cluster-Aware Norm Objectives

Martin G. Herold, Evangelos Kipouridis, Joachim Spoerhase

TL;DR

The paper studies clustering under cluster-aware norm objectives (f,g), unifying and extending classic problems such as k-Center, k-Median, Min-Sum of Radii, and Min-Load k-Clustering. It introduces a layered-ball reduction and a proxy-cost framework that enable primal-dual and LP-based approaches, including Lagrange-multiplier-preserving techniques and bi-point rounding, to achieve polylogarithmic and linear (in k) approximation factors. A key novelty is the attenuation parameter χ for symmetric norms, which allows fine-grained interpolation between extreme objectives and yields an overarching bound of O(min(k^{1-χ_g} log^2 n, n^{χ_f} log k, k)) for (f,g)-Clustering. Together, these results connect and interpolate among the base objectives, giving guarantees that match the best known bounds up to polylogarithmic factors in several regimes.

Abstract

We revisit the $(f,g)$-clustering problem that we introduced in a recent work [SODA'25], and which subsumes fundamental clustering problems such as $k$-Center, $k$-Median, Min-Sum of Radii, and Min-Load $k$-Clustering. This problem assigns each of the $k$ clusters a cost determined by the monotone, symmetric norm $f$ applied to the vector of distances in the cluster, and aims at minimizing the norm $g$ applied to the vector of cluster costs. Previously, we focused on certain special cases for which we designed constant-factor approximation algorithms. Our bounds for more general settings left, however, large gaps to the known bounds for the basic problems they capture. In this work, we provide a clearer picture of the approximability of these more general settings. First, we design an $O(\log^2 n)$-approximation algorithm for $(f, L_{1})$-clustering for any $f$. This improves upon our previous $\widetilde{O}(\sqrt{n})$-approximation. Second, we provide an $O(k)$-approximation for the general $(f,g)$-clustering problem, which improves upon our previous $\widetilde{O}(\sqrt{kn})$-approximation algorithm and matches the best-known upper bound for Min-Load $k$-Clustering. We then design an approximation algorithm for $(f,g)$-clustering that interpolates, up to polylog factors, between the best known bounds for $k$-Center, $k$-Median, Min-Sum of Radii, Min-Load $k$-Clustering, (Top, $L_{1}$)-clustering, and $(L_{\infty},g)$-clustering based on a newly defined parameter of $f$ and $g$.
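
The two-level objective described in the abstract can be illustrated with a small sketch. It only evaluates the $(f,g)$ cost of a given clustering; it is not one of the paper's approximation algorithms. The helper names (`l1`, `linf`, `top`, `fg_cost`) and the toy instance are assumptions made for illustration, and the top-$\ell$ norm is taken in its standard form (sum of the $\ell$ largest absolute entries).

```python
from typing import Callable, Dict, List, Sequence

Norm = Callable[[Sequence[float]], float]


def l1(v: Sequence[float]) -> float:
    """L_1 norm: sum of absolute values."""
    return float(sum(abs(x) for x in v))


def linf(v: Sequence[float]) -> float:
    """L_infinity norm: largest absolute value (0 for an empty vector)."""
    return float(max((abs(x) for x in v), default=0.0))


def top(ell: int) -> Norm:
    """Top-ell norm: sum of the ell largest absolute values."""
    def norm(v: Sequence[float]) -> float:
        return float(sum(sorted((abs(x) for x in v), reverse=True)[:ell]))
    return norm


def fg_cost(dist: List[List[float]], assignment: List[int],
            f: Norm, g: Norm) -> float:
    """Evaluate g over the vector of per-cluster costs, where each
    cluster cost is f over the distances of its points to its center.

    dist[p][c] is the distance from point p to point c;
    assignment[p] is the index of the center serving point p.
    """
    clusters: Dict[int, List[float]] = {}
    for p, c in enumerate(assignment):
        clusters.setdefault(c, []).append(dist[p][c])
    return g([f(d) for d in clusters.values()])


# Hypothetical toy instance: five points on a line, centers at points 1 and 3.
positions = [0.0, 1.0, 3.0, 10.0, 11.0]
dist = [[abs(a - b) for b in positions] for a in positions]
assignment = [1, 1, 1, 3, 3]

k_center = fg_cost(dist, assignment, linf, linf)   # (L_inf, L_inf)
k_median = fg_cost(dist, assignment, l1, l1)       # (L_1, L_1)
min_sum_radii = fg_cost(dist, assignment, linf, l1)  # (L_inf, L_1)
min_load = fg_cost(dist, assignment, l1, linf)     # (L_1, L_inf)
top2_l1 = fg_cost(dist, assignment, top(2), l1)    # (Top-2, L_1)
```

On this instance the two clusters have distance vectors (1, 0, 2) and (0, 1), so the five objectives evaluate to 2, 4, 3, 3, and 4 respectively, showing how the same clustering is priced differently by each choice of $f$ and $g$.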

Paper Structure

This paper contains 32 sections, 26 theorems, 91 equations, 4 figures, 1 table.

Key Result

Theorem 1

There is a factor-$O(\log^2{n})$ approximation for $(\textsf{Sym}, \mathcal{L}_{1} )$-Clustering.

Figures (4)

  • Figure 1: Landscape of approximability mapping any instance of $(f,g)$-Clustering to a point $(\chi_f,\chi_g)\in[0,1]^2$. Black dots show the basic clustering problems $k$-Center, $k$-Median, Min-Sum of Radii, Min-Load $k$-Clustering, $k$-Means, and $(k,z)$-Clustering. Any instance can be approximated within factor $\widetilde{O}(\min\{k^{1-\chi_g}, n^{\chi_f}\})$. The gray boundaries of the square represent $(\textsf{Sym},\mathcal{L}_{1})$-Clustering and $(\mathcal{L}_{\infty},\textsf{Sym})$-Clustering, which admit polylogarithmic ratios. The lower-right gray shaded triangle depicts a hypothetical polynomial hardness region assuming $o(k)$-hardness of approximating compact instances of Min-Load $k$-Clustering. Notice that this region is bounded by a diagonal that is densely populated with $(k,z)$-Clustering instances for $z\geq 1$ and for which constant factors are known.
  • Figure 2: LP for facility-location Ball $k$-median and fixed $\Delta$, $\Gamma$.
  • Figure 3: Dual LP for the LP in Figure 2.
  • Figure 4: LP to decide which facilities from $X_1,X_2$ to open.
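
To make the interpolation in Figure 1 concrete, the following evaluates the three terms of the bound from the TL;DR at the extreme parameter values. This is only arithmetic on the stated formula. Which norm sits at which extreme (here $\mathcal{L}_{1}$ at $\chi = 1$ and $\mathcal{L}_{\infty}$ at $\chi = 0$) is an assumption inferred from the caption's description of the gray boundaries, not a definition taken from the paper.

```latex
\begin{align*}
\chi_g = 1 &: \quad k^{1-\chi_g}\log^2 n = \log^2 n
  && \text{(polylogarithmic; agrees with Theorem 1 if } \chi_{\mathcal{L}_1} = 1\text{)} \\
\chi_g = 0 &: \quad k^{1-\chi_g}\log^2 n = k\log^2 n \;\ge\; k
  && \text{(the third term, } k\text{, is the smaller one)} \\
\chi_f = 0 &: \quad n^{\chi_f}\log k = \log k
  && \text{(polylogarithmic)} \\
\chi_f = 1 &: \quad n^{\chi_f}\log k = n\log k
  && \text{(never better than } k \text{ when } k \le n\text{)}
\end{align*}
```

Under this reading, the minimum is polylogarithmic along the two gray boundaries and degrades to the general $O(k)$ guarantee only toward the corner with $\chi_f = 1$ and $\chi_g = 0$.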

Theorems & Definitions (67)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Definition 6: $(f,g)$-Clustering
  • Definition 7: $(I,O)$-Clustering
  • Definition 8: top-$\ell$ norm
  • Definition 9: ordered norm
  • Definition 10
  • Theorem 15
  • ...and 57 more