Incremental (k, z)-Clustering on Graphs

Emilio Cruciani; Sebastian Forster; Antonis Skarlatos

TL;DR

This work develops a randomized incremental framework for the graph $(k,z)$-clustering problem under adversarial edge insertions. It first maintains a constant-factor bicriteria approximation of size $\tilde{O}(k)$, then uses a dynamic spanner to reduce the problem to a sparse, distance-preserving dynamic instance on the resulting center set, and finally runs a static $(k,z)$-clustering algorithm on this reduced instance. The primary theoretical contribution is a total update time of $\tilde{O}(k m^{1+o(1)}+ k^{1+1/\lambda} m)$ with high probability, which corresponds to an amortized update time of $\tilde{O}(k n^{o(1)}+ k^{1+1/\lambda})$ for any fixed constant $\lambda\ge 1$, while maintaining an $O(1)$-approximation to the optimal $(k,z)$-clustering. The algorithm combines a carefully structured variant of the Mettu-Plaxton bicriteria algorithm (MP-bi), a leaking-set mechanism, and a dynamic spanner-based reduction, with potential impact on clustering in evolving networks.
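The relation between the total and the amortized bound can be checked in one line. The following is a sketch under two assumptions that the summary does not state explicitly: that the amortization is taken over $m$ edge insertions, and that the graph is simple, so $m \le n^2$.

```latex
\[
\frac{\tilde{O}\!\left(k\, m^{1+o(1)} + k^{1+1/\lambda}\, m\right)}{m}
  = \tilde{O}\!\left(k\, m^{o(1)} + k^{1+1/\lambda}\right)
  = \tilde{O}\!\left(k\, n^{o(1)} + k^{1+1/\lambda}\right),
\qquad\text{since } m^{o(1)} \le n^{2\cdot o(1)} = n^{o(1)} .
\]
```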

Abstract

Given a weighted undirected graph, a number of clusters $k$, and an exponent $z$, the goal in the $(k, z)$-clustering problem on graphs is to select $k$ vertices as centers that minimize the sum of the distances raised to the power $z$ of each vertex to its closest center. In the dynamic setting, the graph is subject to adversarial edge updates, and the goal is to maintain explicitly an exact $(k, z)$-clustering solution in the induced shortest-path metric. While efficient dynamic $k$-center approximation algorithms on graphs exist [Cruciani et al. SODA 2024], to the best of our knowledge, no prior work provides similar results for the dynamic $(k,z)$-clustering problem. As the main result of this paper, we develop a randomized incremental $(k, z)$-clustering algorithm that maintains with high probability a constant-factor approximation in a graph undergoing edge insertions with a total update time of $\tilde{O}(k m^{1+o(1)}+ k^{1+\frac{1}{\lambda}} m)$, where $\lambda \geq 1$ is an arbitrary fixed constant. Our incremental algorithm consists of two stages. In the first stage, we maintain a constant-factor bicriteria approximate solution of size $\tilde{O}(k)$ with a total update time of $m^{1+o(1)}$ over all adversarial edge insertions. This first stage is an intricate adaptation of the bicriteria approximation algorithm by Mettu and Plaxton [Machine Learning 2004] to incremental graphs. One of our key technical results is that the radii in their algorithm can be assumed to be non-decreasing while the approximation ratio remains constant, a property that may be of independent interest. In the second stage, we maintain a constant-factor approximate $(k,z)$-clustering solution on a dynamic weighted instance induced by the bicriteria approximate solution. For this subproblem, we employ a dynamic spanner algorithm together with a static $(k,z)$-clustering algorithm.
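To make the objective concrete, here is a minimal Python sketch that evaluates the $(k,z)$-clustering cost of a given center set in the shortest-path metric. It is not the paper's algorithm: it recomputes everything from scratch with a multi-source Dijkstra, and the function name `kz_cost` and the edge-list input format are assumptions made for illustration only.

```python
import heapq
from collections import defaultdict


def kz_cost(n, edges, centers, z):
    """Sum over all vertices of (distance to the closest center) ** z.

    n       -- number of vertices, labeled 0..n-1 (illustrative convention)
    edges   -- iterable of (u, v, weight) for an undirected weighted graph
    centers -- iterable of vertices chosen as centers
    z       -- exponent of the objective (z = 1: k-median, z = 2: k-means)
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    # Multi-source Dijkstra: every center starts at distance 0, so the final
    # label of a vertex is its distance to the closest center.
    dist = [float("inf")] * n
    heap = []
    for c in centers:
        dist[c] = 0.0
        heap.append((0.0, c))
    heapq.heapify(heap)

    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))

    return sum(d ** z for d in dist)


# Path 0 - 1 - 2 - 3 with unit weights and a single center at vertex 1.
path = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
cost_before = kz_cost(4, path, [1], z=2)                   # 1 + 0 + 1 + 4
cost_after = kz_cost(4, path + [(1, 3, 1.0)], [1], z=2)    # 1 + 0 + 1 + 1
```

In this small example, inserting the edge between vertices 1 and 3 shortens one distance and lowers the cost of the same center set, which illustrates why an incremental algorithm has to track distances that only shrink over time.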

Paper Structure (58 sections, 53 theorems, 48 equations, 1 figure, 4 algorithms)

Introduction
Efficient $k$-clustering algorithms.
Graph setting.
Dynamic $k$-clustering algorithms.
Comparison of dynamic graphs and dynamic point sets.
Our Contributions
Comparison between $(k, z)$-clustering and $k$-center.
Structure of the Paper
Technical Overview
Basis of Our Incremental Bicriteria Approximation Algorithm
MP-bi algorithm.
Challenges in the Incremental Setting
Number of distinct sequences of the radii.
Incremental Bicriteria Approximation Algorithm
Combining Non-Increasing and Monotonicity Properties in the Incremental Setting
...and 43 more sections

Key Result

theorem 1.1

There is a randomized incremental algorithm for the $(k, z)$-clustering problem that, given a weighted undirected graph $G = (V, E, w)$ with maximum edge weight $W$ subject to edge insertions, an integer $k \geq 1$, and constants $z \geq 1, \epsilon \in (0, 1)$, maintains with high probability: and with high probability has an amortized update time of $\tilde{O} (n^{o(1)})$.

Figures (1)

Figure 1: The thick blue regions depict the $i$-th approximate ball $B_i$ for a level $i \in [0, t]$, and the larger blue vertices represent the $i$-th candidate set $S_i$. The black lines indicate distances. Left: the $i$-th approximate ball before the adversarial edge insertion. Right: after the adversarial edge insertion, some vertices enter the $i$-th approximate ball $B_i$ (brown vertices); the brown lines and dots depict their new, shorter distances. Since the $i$-th radius $\nu_i$ decreases (the dashed blue region indicates the old radius), some vertices enter the temporary leaking set $Z$.

Theorems & Definitions (119)

theorem 1.1
theorem 1.2
definition 3.1: $(k,z)$-clustering problem on graphs
definition 3.2: $\rho$-approximate $(k,z)$-clustering problem
definition 3.3: $(\alpha, \beta)$-bicriteria approximation
definition 3.4: $(\alpha, \beta)$-bicriteria approximate assignment
definition 3.4: weighted $(k,z)$-clustering problem on graphs
lemma 3.5: incremental $(1+\epsilon)$-approximate SSSP [liu2025incremental]
lemma 3.6: [dubhashi2009concentration]
proof
...and 109 more
