Dynamic Spectral Clustering with Provable Approximation Guarantee

Steinar Laenen, He Sun

TL;DR

This work addresses scalable clustering on dynamically evolving graphs where both edges and vertices are inserted and the intrinsic cluster structure may change over time. It introduces a dynamic cluster-preserving sparsifier along with a contracted-graph sketch to track clusters, enabling spectral clustering with provable guarantees under mild initial-gap conditions and controlled growth. The proposed SZ-based dynamic sparsification and the contracted-graph framework yield an amortized update time of $O(1)$ and a query time of $o(n_T)$, while maintaining clustering quality via eigen-gap and conductance guarantees. Experimentally, the method delivers substantial speedups over recomputing spectral clustering on the full graph, with comparable accuracy on both synthetic SBM variants and real datasets such as MNIST and EMNIST. Overall, the paper advances dynamic clustering by coupling cluster-preserving sparsification with a compact spectral sketch that preserves the essential cluster structure under incremental growth.
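To make the idea of a contracted graph concrete, here is a minimal Python sketch of generic cluster contraction: every vertex with a known cluster is merged into one super vertex, and parallel edges have their weights summed. The function name `contract_graph`, the `("c", id)` / `("v", id)` node encoding, and the handling of unassigned vertices are assumptions made for illustration; the paper's own construction may differ in its details.

```python
from collections import defaultdict


def contract_graph(edges, cluster_of):
    """Contract each cluster of an undirected weighted graph into a super vertex.

    edges: iterable of (u, v, weight) triples.
    cluster_of: dict mapping a vertex to its cluster id. Vertices missing
        from the dict (for example, newly inserted ones) are kept as they are.
    Returns a dict mapping an unordered node pair to the summed edge weight.
    Edges inside one cluster become self-loops on that cluster's super vertex.
    """
    contracted = defaultdict(float)
    for u, v, w in edges:
        # Tag nodes so cluster ids and vertex ids never collide (hypothetical encoding).
        a = ("c", cluster_of[u]) if u in cluster_of else ("v", u)
        b = ("c", cluster_of[v]) if v in cluster_of else ("v", v)
        # Store each undirected pair under one canonical ordering.
        key = tuple(sorted((a, b), key=repr))
        contracted[key] += w
    return dict(contracted)


# Vertices 0-3 are already clustered; vertex 4 is new and stays uncontracted.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
cluster_of = {0: 0, 1: 0, 2: 1, 3: 1}
small = contract_graph(edges, cluster_of)
```

The contracted graph has one node per cluster plus one node per unassigned vertex, so its size depends on the number of clusters and new vertices rather than on the full vertex count.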

Abstract

This paper studies clustering algorithms for dynamically evolving graphs $\{G_t\}$, in which new edges (and potential new vertices) are added into a graph, and the underlying cluster structure of the graph can gradually change. The paper proves that, under some mild condition on the cluster-structure, the clusters of the final graph $G_T$ of $n_T$ vertices at time $T$ can be well approximated by a dynamic variant of the spectral clustering algorithm. The algorithm runs in amortised update time $O(1)$ and query time $o(n_T)$. Experimental studies on both synthetic and real-world datasets further confirm the practicality of our designed algorithm.
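As a rough illustration of how an edge insertion can be processed in constant time, the following Python sketch keeps or discards each arriving edge using a degree-based sampling rule. The class name `IncrementalSampler`, the parameter `tau`, and the rule $p = p_u + p_v - p_u p_v$ with $p_u = \min(1, \tau/\deg(u))$ are assumptions chosen for illustration, not the paper's algorithm. In particular, the sketch never revisits earlier sampling decisions as degrees grow.

```python
import random
from collections import defaultdict


class IncrementalSampler:
    """Toy degree-based edge sampler for an insert-only graph (illustrative only)."""

    def __init__(self, tau, seed=0):
        self.tau = tau                  # hypothetical oversampling parameter
        self.degree = defaultdict(int)  # current degree of each vertex
        self.kept = {}                  # sampled edge (u, v) -> reweighted value
        self.rng = random.Random(seed)

    def _p(self, u):
        # Per-endpoint sampling probability; the degree is at least 1 when called.
        return min(1.0, self.tau / self.degree[u])

    def insert(self, u, v):
        """Process one new edge in O(1) time; return True if it was kept."""
        self.degree[u] += 1
        self.degree[v] += 1
        pu, pv = self._p(u), self._p(v)
        p = pu + pv - pu * pv  # probability that at least one endpoint keeps the edge
        if self.rng.random() < p:
            self.kept[(u, v)] = 1.0 / p  # reweight so the expected weight stays 1
            return True
        return False


sampler = IncrementalSampler(tau=2.0, seed=1)
stream = [(i, j) for i in range(20) for j in range(i + 1, 20)]
for u, v in stream:
    sampler.insert(u, v)
```

Each call touches only the two endpoint degrees and one dictionary entry, which is the kind of constant-time bookkeeping an $O(1)$ amortised update bound requires.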

Paper Structure

This paper contains 23 sections, 19 theorems, 104 equations, 3 figures, 2 tables, 7 algorithms.

Key Result

Theorem 1.1

Let $G_1=(V_1,E_1)$ be a graph of $n_1$ vertices and $k=\widetilde{O}(1)$ clusters. (We use $\widetilde{O}(n)$ to represent $O(n \cdot\log^c (n))$ for constant $c$.) Assume that new edges, which could be adjacent to new vertices, are added to $G_t$ at each time $t$ to obtain $G_{t+1}$, and there are $O

Figures (3)

  • Figure 1: Illustration of our technique. The black and red edges in Figure (a) are the edges in $G_t$ and the added ones in $G_{t'}$; the dashed black and red edges in Figure (b) are the ones added in $H_t$ and $H_{t'}$; the black and red edges in Figure (c) are the ones in $\widetilde{G}_t$ and $\widetilde{G}_{t'}$.
  • Figure 2: Results on the two versions of our dynamic SBM. Figures (a) and (b) report the average ARI score at each time $T$ for the clustering results on $G_T$, $H_T$, and $\widetilde{G}_T$; Figures (c) and (d) report the running time in seconds at each time $T$. Shaded regions indicate the standard deviation.
  • Figure 3: Results on MNIST and EMNIST. Figures (a) and (b) report the average ARI scores at each time $T$ for the clustering results on $G_T$, $H_T$, and $\widetilde{G}_T$; Figures (c) and (d) report the average cumulative running time in seconds at each time $T$. Shaded regions indicate the standard deviation.
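Both figures report the adjusted Rand index (ARI). For readers unfamiliar with the metric, here is a minimal pure-Python sketch of the standard ARI formula based on pair counts. The function name `adjusted_rand_index` is a hypothetical helper for illustration, not the authors' evaluation code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Pairs of items placed together in both labelings, in the first, in the second.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = together_true * together_pred / comb(n, 2)
    max_index = (together_true + together_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (together_both - expected) / (max_index - expected)


same_up_to_renaming = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

An ARI of 1 means the two clusterings agree up to a renaming of the labels, values near 0 correspond to chance-level agreement, and negative values indicate less agreement than expected by chance.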

Theorems & Definitions (40)

  • Theorem 1.1: Informal statement of Theorem \ref{thm:main_dynamic_SC}
  • Lemma 2.1: Higher-order Cheeger inequality \cite{higherCheeg}
  • Definition 3.1: Cluster-preserving sparsifier
  • Lemma 3.2
  • Theorem 3.3
  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Lemma 4.5
  • ...and 30 more
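For reference, the higher-order Cheeger inequality named in Lemma 2.1 is commonly stated as follows. The notation here is the standard one and is an assumption: $\lambda_k$ is the $k$-th smallest eigenvalue of the normalised Laplacian of $G$, $\Phi_G(S)$ is the conductance of a vertex set $S$, and $\rho(k)$ is the $k$-way expansion constant. The paper's own statement may use different symbols or constants.

$$
\rho(k) \triangleq \min_{\substack{S_1,\ldots,S_k \subseteq V \\ \text{disjoint, non-empty}}} \; \max_{1 \le i \le k} \Phi_G(S_i),
\qquad
\frac{\lambda_k}{2} \;\le\; \rho(k) \;\le\; O(k^2)\,\sqrt{\lambda_k}.
$$

Informally, a small $\lambda_k$ certifies that $G$ contains $k$ disjoint sets of low conductance, which is why eigenvalue gaps are a natural way to express cluster structure.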