Dynamic Correlation Clustering in Sublinear Update Time

Vincent Cohen-Addad, Silvio Lattanzi, Andreas Maggiori, Nikos Parotsidis

TL;DR

The paper tackles dynamic correlation clustering in node streams, where nodes can be adversarially added and randomly deleted. It introduces a constant-factor approximation algorithm with sublinear update time, built on an agreement-based framework with sampling and a Notify-based dynamic apparatus, achieving $O(\mathrm{polylog}\, n)$ amortized update time. The authors prove correctness and runtime guarantees and validate the approach with experiments on real-world graphs, showing robust performance across varying densities. This work lays a foundation for scalable clustering in evolving networks and leaves open whether the sublinear guarantees extend to more adversarial settings and whether the amortized bounds can be improved.

Abstract

We study the classic problem of correlation clustering in dynamic node streams. In this setting, nodes are either added or randomly deleted over time, and each node pair is connected by a positive or negative edge. The objective is to continuously find a partition which minimizes the sum of positive edges crossing clusters and negative edges within clusters. We present an algorithm that maintains an $O(1)$-approximation with $O(\mathrm{polylog}\, n)$ amortized update time. Prior to our work, Behnezhad, Charikar, Ma, and L. Tan achieved a $5$-approximation with $O(1)$ expected update time in edge streams, which translates in node streams to an $O(D)$ update time, where $D$ is the maximum possible degree. Finally, we complement our theoretical analysis with experiments on real-world data.
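The objective stated in the abstract can be made concrete with a small static example. The following is a minimal sketch, not the paper's dynamic algorithm: it only counts disagreements for a given partition. The function name `correlation_clustering_cost`, the input representation (a list of positive edges, with every other pair treated as negative), and the toy graph are assumptions made for illustration.

```python
from itertools import combinations


def correlation_clustering_cost(nodes, positive_edges, cluster_of):
    """Count disagreements of a partition (hypothetical helper).

    A pair contributes 1 if it is a positive edge whose endpoints lie in
    different clusters, or a negative pair whose endpoints share a cluster.
    Any pair not listed in positive_edges is treated as negative.
    """
    positive = {frozenset(edge) for edge in positive_edges}
    cost = 0
    for u, v in combinations(nodes, 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        is_positive = frozenset((u, v)) in positive
        if is_positive != same_cluster:
            cost += 1
    return cost


# Toy instance: a positive triangle a-b-c plus one positive edge c-d.
nodes = ["a", "b", "c", "d"]
positive_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

triangle_plus_d = {"a": 0, "b": 0, "c": 0, "d": 1}
singletons = {v: i for i, v in enumerate(nodes)}
one_cluster = {v: 0 for v in nodes}

print(correlation_clustering_cost(nodes, positive_edges, triangle_plus_d))  # 1
print(correlation_clustering_cost(nodes, positive_edges, singletons))       # 4
print(correlation_clustering_cost(nodes, positive_edges, one_cluster))      # 2
```

This brute-force count inspects all node pairs, so it is quadratic in the number of nodes. It is useful only for checking small instances, in contrast to the polylogarithmic amortized update time the paper targets.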

Paper Structure

This paper contains 29 sections, 35 theorems, 131 equations, 4 figures, 3 tables, and 9 algorithms.

Key Result

Lemma 4

Let $\mathcal{C} = \{C_1, C_2, \dots, C_k \}$ be a clustering solution for graph $G = (V, E)$ and let $\varepsilon$ be a small enough constant. If the following properties hold: Then the cost of $\mathcal{C}$ is a constant factor approximation to that of the optimal correlation clustering solution for graph $G$.

Figures (4)

  • Figure 1: Correlation clustering objective relative to singletons
  • Figure 2: Correlation clustering objective relative to singletons
  • Figure 3: Correlation clustering objective relative to singletons
  • Figure 4: Correlation clustering objective relative to singletons

Theorems & Definitions (83)

  • Definition 1: Database model AssasdiCC
  • Definition 2: Agreement
  • Definition 3: Heaviness
  • Lemma 4: rephrased from Cohen-Addad et al. (ICML 2021)
  • Definition 5
  • Theorem 6
  • proof
  • proof
  • proof
  • proof
  • ...and 73 more