Pruned Pivot: Correlation Clustering Algorithm for Dynamic, Parallel, and Local Computation Models
Mina Dalirrooyfard, Konstantin Makarychev, Slobodan Mitrović
TL;DR
This work introduces Pruned Pivot, a Pivot-like algorithm for correlation clustering on unweighted graphs that achieves a $3+\varepsilon$ approximation while enabling scalable implementations in dynamic, MPC, and LCA models. It develops a depth-bounded recursive formulation whose pruning yields near-optimal tradeoffs, and provides comprehensive analyses (including dangerous and expensive query-path concepts and martingale bounds) that support the approximation and efficiency claims. The paper delivers concrete algorithmic results: a fully dynamic algorithm with expected amortized update time $O(1/\varepsilon)$, an MPC algorithm with $O(\log(1/\varepsilon))$ rounds, an LCA with $O(\Delta/\varepsilon)$ probes, and a CRCW PRAM implementation in $O(1/\varepsilon)$ rounds. Experimental evaluation suggests that exploring only a small number of nodes suffices to achieve near-Pivot performance, highlighting the approach’s practicality for large, evolving graphs. Overall, the method substantially improves the scalability of correlation clustering across dynamic, parallel, and local computation models, while preserving strong approximation guarantees.
Abstract
Given a graph with positive and negative edge labels, the correlation clustering problem aims to cluster the nodes so as to minimize the total number of positive edges between clusters and negative edges within clusters. This problem has many applications in data mining, particularly in unsupervised learning. Inspired by the prevalence of large graphs and constantly changing data in modern applications, we study correlation clustering in dynamic, parallel (MPC), and local computation (LCA) settings. We design an approach that improves state-of-the-art runtime complexities in all these settings. In particular, we provide the first fully dynamic algorithm that runs in expected amortized constant time, without any dependence on the graph size. Moreover, our algorithm essentially matches the approximation guarantee of the celebrated Pivot algorithm.
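To make the objective concrete, the following is a minimal Python sketch of the classic sequential Pivot algorithm together with the disagreement count defined above. It is not the Pruned Pivot algorithm of this paper; it only illustrates the baseline that Pruned Pivot is compared against. The function names `pivot` and `disagreements` are hypothetical, and the sketch assumes the unweighted setting in which every pair of nodes not joined by a positive edge is treated as a negative edge.

```python
import random
from itertools import combinations


def pivot(nodes, positive_edges, seed=None):
    """Classic Pivot: process nodes in random order; each still-unclustered
    node becomes a pivot and absorbs its unclustered positive neighbors."""
    rng = random.Random(seed)
    adj = {v: set() for v in nodes}
    for u, v in positive_edges:
        adj[u].add(v)
        adj[v].add(u)

    order = list(nodes)
    rng.shuffle(order)

    cluster_of = {}  # node -> pivot that identifies its cluster
    for p in order:
        if p in cluster_of:
            continue  # already absorbed by an earlier pivot
        cluster_of[p] = p
        for w in adj[p]:
            if w not in cluster_of:
                cluster_of[w] = p
    return cluster_of


def disagreements(nodes, positive_edges, cluster_of):
    """Count positive pairs split across clusters plus negative pairs
    placed in the same cluster (assumption: non-positive pairs are negative)."""
    pos = {frozenset(e) for e in positive_edges}
    cost = 0
    for u, v in combinations(nodes, 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        is_positive = frozenset((u, v)) in pos
        if is_positive != same_cluster:
            cost += 1
    return cost


if __name__ == "__main__":
    nodes = [0, 1, 2, 3]
    positive = [(0, 1), (1, 2), (0, 2)]  # a positive triangle plus an isolated node
    clustering = pivot(nodes, positive, seed=0)
    print(clustering, disagreements(nodes, positive, clustering))
```

On a positive triangle with one isolated node, any pivot order recovers the triangle as a single cluster and leaves the isolated node alone, so the clustering has no disagreements. On a positive path with three nodes, every pivot order yields exactly one disagreement, which is also the optimum for that instance.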
