Dimension-Free Parameterized Approximation Schemes for Hybrid Clustering

Ameet Gadekar, Tanmay Inamdar

TL;DR

This work gives a dimension-free, randomized bicriteria FPT approximation scheme for Hybrid $k$-Clustering: it removes the exponential dependence on the dimension while keeping the $(1+\varepsilon,1+\varepsilon)$ guarantee. The method is a uniform witness-sampling framework that relies on bounded algorithmic $\varepsilon$-scatter dimension and uses inflated radii to overcome intrinsic barriers; the resulting running time is FPT in $k$ and $\varepsilon$ for Euclidean spaces and several structured metric spaces. Extensions cover Hybrid $(k,z)$-clustering and Hybrid Norm clustering, together with a coreset construction for doubling metrics, collectively answering open questions from prior work. These results extend dimension-free clustering techniques to a wider range of metric spaces and objective families.

Abstract

Hybrid $k$-Clustering is a model of clustering that generalizes two of the most widely studied clustering objectives: $k$-Center and $k$-Median. In this model, given a set of $n$ points $P$, the goal is to find $k$ centers such that the sum of the $r$-distances of each point to its nearest center is minimized. The $r$-distance between two points $p$ and $q$ is defined as $\max\{d(p, q)-r, 0\}$ -- this represents the distance of $p$ to the boundary of the $r$-radius ball around $q$ if $p$ is outside the ball, and $0$ otherwise. This problem was recently introduced by Fomin et al. [APPROX 2024], who designed a $(1+\varepsilon, 1+\varepsilon)$-bicriteria approximation that runs in time $2^{(kd/\varepsilon)^{O(1)}} \cdot n^{O(1)}$ for inputs in $\mathbb{R}^d$; such a bicriteria solution uses balls of radius $(1+\varepsilon)r$ instead of $r$, and has a cost at most $1+\varepsilon$ times the cost of an optimal solution using balls of radius $r$. In this paper we significantly improve upon this result by designing an approximation algorithm with the same bicriteria guarantee, but with running time that is FPT only in $k$ and $\varepsilon$ -- crucially, removing the exponential dependence on the dimension $d$. This resolves an open question posed in their paper. Our results extend further in several directions. First, our approximation scheme works in a broader class of metric spaces, including doubling spaces, minor-free metrics, and bounded treewidth metrics. Second, our techniques yield similar bicriteria FPT-approximation schemes for other variants of Hybrid $k$-Clustering, e.g., when the objective features the sum of the $z$-th powers of the $r$-distances. Finally, we also design a coreset for Hybrid $k$-Clustering in doubling spaces, answering another open question from the work of Fomin et al.
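
To make the objective concrete, here is a minimal Python sketch that evaluates the Hybrid $k$-Clustering cost of a given set of centers in Euclidean space. It only evaluates the objective and is not the paper's algorithm; the function names `r_distance` and `hybrid_cost`, the optional exponent `z` (for the $z$-th power variant), and the toy data are assumptions made for illustration.

```python
import math
from typing import Sequence

Point = Sequence[float]


def r_distance(p: Point, q: Point, r: float) -> float:
    """r-distance max{d(p, q) - r, 0}, with d the Euclidean distance."""
    return max(math.dist(p, q) - r, 0.0)


def hybrid_cost(points: Sequence[Point], centers: Sequence[Point],
                r: float, z: float = 1.0) -> float:
    """Sum over all points of the z-th power of the r-distance to the
    nearest center (z = 1 is the plain Hybrid k-Clustering objective)."""
    return sum(
        min(r_distance(p, c, r) for c in centers) ** z
        for p in points
    )


# Toy instance (hypothetical data): three points, two candidate centers.
points = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]

# r = 0: the r-distance is the ordinary distance, so this is the k-Median cost.
k_median_cost = hybrid_cost(points, centers, r=0.0)   # 5.0

# 0 < r < 5: only the part of the distance beyond the ball boundary is paid.
partial_cost = hybrid_cost(points, centers, r=2.0)    # 3.0

# r = 5: every point lies within distance r of some center, so the cost is 0,
# which is exactly the k-Center feasibility condition for radius r.
covered_cost = hybrid_cost(points, centers, r=5.0)    # 0.0
```

Setting `r=0` recovers the $k$-Median cost, while a cost of zero certifies that balls of radius $r$ around the centers cover all of $P$; this is the sense in which the model interpolates between the two classical objectives.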

Paper Structure

This paper contains 13 sections, 14 theorems, 2 equations, and 2 algorithms.

Key Result

Theorem 1

There exists a randomized algorithm that, given an instance of Hybrid $k$-Clustering in $\mathbb{R}^d$ for any dimension $d$, runs in time $2^{\mathcal{O}(k \log k \cdot (1/\varepsilon^5) \log^2 (1/\varepsilon))} \cdot n^{\mathcal{O}(1)}$ and returns a $(1+\varepsilon, 1+\varepsilon)$-bicriteria approximation ...
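
For reference, the bicriteria guarantee described in the abstract can be written out as follows. The shorthand $\mathrm{cost}_r$ and $\mathrm{OPT}_r$ is an assumption introduced here for illustration and may differ from the paper's own notation.

```latex
\[
  \mathrm{cost}_{r}(P, X) \;=\; \sum_{p \in P} \min_{x \in X} \max\{d(p, x) - r,\, 0\},
  \qquad
  \mathrm{OPT}_{r} \;=\; \min_{X \,:\, |X| = k} \mathrm{cost}_{r}(P, X).
\]
% A set X of k centers is a (1+\varepsilon, 1+\varepsilon)-bicriteria
% approximation if, with radius inflated to (1+\varepsilon) r, its cost is
% within a (1+\varepsilon) factor of the optimum for the original radius r:
\[
  \mathrm{cost}_{(1+\varepsilon) r}(P, X) \;\le\; (1+\varepsilon) \cdot \mathrm{OPT}_{r}.
\]
```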

Theorems & Definitions (20)

  • Theorem 1: Bicriteria FPT-AS for Euclidean Spaces
  • Theorem 2: Informal version of Theorem 9
  • Theorem 3: Coreset for Hybrid $k$-Clustering
  • Definition 4: Ball Intersection Problem
  • Definition 5: Algorithmic $\varepsilon$-Scatter Dimension
  • Lemma 5
  • Lemma 7
  • Lemma 8
  • Theorem 9: Main Theorem
  • Lemma 10: Feasible upper bounds
  • ...and 10 more