Understanding the Cluster LP for Correlation Clustering

Nairen Cao; Vincent Cohen-Addad; Euiwoong Lee; Shi Li; Alantha Newman; Lukas Vogl

TL;DR

The paper introduces the cluster LP as a strong linear programming relaxation for Correlation Clustering, giving a single framework that subsumes prior LP and rounding approaches. It develops two complementary rounding schemes (cluster-based, and pivot-based with correlated rounding), supported by a preclustering step that isolates a small, uncertain part of the instance and controls additive errors. Through a combination of analytical triangle-based budgeting and a computer-assisted factor-revealing SDP, the authors achieve a $(1.485+\varepsilon)$-approximation and, with a more analytical proof, a $(1.56+\varepsilon)$-approximation. They also establish an integrality gap of $4/3-\varepsilon$ for the cluster LP and NP-hardness of approximation within a ratio of $24/23-\varepsilon$, narrowing the range in which the best achievable approximation ratio can lie. The results advance both algorithmic guarantees and hardness evidence for Correlation Clustering, and show how powerful LP/SDP relaxations can be when paired with carefully designed rounding and preclustering techniques. The work provides a route to near-optimal clustering in complete graphs with signed edges, and a solid theoretical benchmark for future improvements and hardness results.

Abstract

In the classic Correlation Clustering problem introduced by Bansal, Blum, and Chawla (FOCS 2002), the input is a complete graph where edges are labeled either $+$ or $-$, and the goal is to find a partition of the vertices that minimizes the sum of the +edges across parts plus the sum of the -edges within parts. In recent years, Chawla, Makarychev, Schramm and Yaroslavtsev (STOC 2015) gave a 2.06-approximation by providing a near-optimal rounding of the standard LP, and Cohen-Addad, Lee, Li, and Newman (FOCS 2022, 2023) finally bypassed the integrality gap of 2 for this LP giving a $1.73$-approximation for the problem. In order to create a simple and unified framework for Correlation Clustering similar to those for typical approximate optimization tasks, we propose the cluster LP as a strong linear program for Correlation Clustering. We demonstrate the power of the cluster LP by presenting new rounding algorithms, and providing two analyses, one analytically proving a 1.56-approximation and the other solving a factor-revealing SDP to show a 1.485-approximation. Both proofs introduce principled methods by which to analyze the performance of the algorithm, resulting in a significantly improved approximation guarantee. Finally, we prove an integrality gap of $4/3$ for the cluster LP, showing our 1.485-upper bound cannot be drastically improved. Our gap instance directly inspires an improved NP-hardness of approximation with a ratio $24/23 \approx 1.042$; no explicit hardness ratio was known before.
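To make the objective in the abstract concrete, here is a minimal Python sketch that evaluates the cost of a given partition on a complete signed graph. It is an illustration only, not code from the paper: the function name `correlation_clustering_cost`, the input encoding (a list of `+` edges, with every other pair treated as a `-` edge), and the unit edge weights are all assumptions made for this example.

```python
from itertools import combinations


def correlation_clustering_cost(vertices, plus_edges, clusters):
    """Count disagreements of a partition on a complete signed graph.

    Assumptions (hypothetical encoding, not from the paper):
    - every pair listed in `plus_edges` is a + edge;
    - every other pair of vertices is a - edge;
    - all edges have unit weight.
    """
    # Map each vertex to the index of the cluster containing it.
    label = {}
    for idx, cluster in enumerate(clusters):
        for v in cluster:
            label[v] = idx

    plus = {frozenset(e) for e in plus_edges}

    cost = 0
    for u, v in combinations(vertices, 2):
        same_cluster = label[u] == label[v]
        is_plus = frozenset((u, v)) in plus
        # A + edge across parts or a - edge within a part is a disagreement.
        if is_plus != same_cluster:
            cost += 1
    return cost


# Toy instance: a + triangle on {0, 1, 2}; vertex 3 has only - edges.
vertices = [0, 1, 2, 3]
plus_edges = [(0, 1), (1, 2), (0, 2)]

print(correlation_clustering_cost(vertices, plus_edges, [[0, 1, 2], [3]]))  # 0
print(correlation_clustering_cost(vertices, plus_edges, [[0, 1, 2, 3]]))    # 3
```

On this toy instance the partition `{0, 1, 2}, {3}` agrees with every edge label, while merging all four vertices pays for the three `-` edges incident to vertex 3.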

Paper Structure (57 sections, 43 theorems, 158 equations, 6 algorithms)

Introduction
Our Results
Further Related Work
Overview and Algorithm
Formulating and Solving Cluster LP
Previous Algorithms.
Sherali-Adams and Correlated Rounding.
Preclustering.
Solving Cluster LP on Preclustered Instance by Sampling.
Rounding Cluster LP
Analysis of Cluster-Based Rounding Procedure.
Notations and Analysis for Pivot-Based Rounding Procedure.
Global Triangle Distributions.
Gaps and Hardness
Organization.
...and 42 more sections

Key Result

Theorem 1

Let $\varepsilon > 0$ be a small enough constant and ${\mathrm{opt}}$ be the cost of the optimum solution to the given Correlation Clustering instance. In time $n^{{\mathrm{poly}}(1/\varepsilon)}$, we can output a solution $((z_S)_{S \subseteq V},(x_{uv})_{uv \in {V \choose 2}})$ to the cluster LP w

Theorems & Definitions (105)

Theorem 1
Theorem 2
Theorem 3
Theorem 4
Theorem 5
Lemma 6
proof
Lemma 7
proof
Lemma 8
...and 95 more
