A Simple and Fast $(3+\varepsilon)$-approximation for Constrained Correlation Clustering

Nate Veldt

TL;DR

This work resolves an open question on Constrained Correlation Clustering by achieving a $(3+\varepsilon)$-approximation in $\tilde{O}(n^3)$ time. It introduces a new covering LP with HEAP constraints, constructs a pivot-safe auxiliary graph $\hat{G}$, and rounds the LP solution by running the Pivot algorithm on $\hat{G}$, which keeps the procedure simple and combinatorial. The approach improves on the prior $\tilde{O}(n^3)$-time method, which yielded a 16-approximation, and comes within $\varepsilon$ of the best-known 3-approximation in a significantly faster and simpler framework. The paper also provides streamlined algorithms for the one-sided cases with only friendly or only hostile constraints (FriendlyCC and HostileCC), broadening practical applicability while preserving the approximation guarantees.

Abstract

In Constrained Correlation Clustering, the goal is to cluster a complete signed graph in a way that minimizes the number of negative edges inside clusters plus the number of positive edges between clusters, while respecting hard constraints on how to cluster certain friendly or hostile node pairs. Fischer et al. [FKKT25a] recently developed a $\tilde{O}(n^3)$-time 16-approximation algorithm for this problem. We settle an open question posed by these authors by designing an algorithm that is equally fast but brings the approximation factor down to $(3+\varepsilon)$ for arbitrary constant $\varepsilon > 0$. Although several new algorithmic steps are needed to obtain our improved approximation, our approach maintains many advantages in terms of simplicity. In particular, it relies mainly on rounding a (new) covering linear program, which can be approximated quickly and combinatorially. Furthermore, the rounding step amounts to applying the very familiar Pivot algorithm to an auxiliary graph. Finally, we develop much simpler algorithms for instances that involve only friendly or only hostile constraints.
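
To make the objective and the Pivot step in the abstract concrete, here is a minimal Python sketch. It assumes an unconstrained complete signed graph given as a set of positive edges (every other pair is negative), and the helper names `pivot` and `disagreements` are illustrative only. It does not build the paper's covering LP or auxiliary graph $\hat{G}$; it only shows what running Pivot on some signed graph and counting mistakes looks like.

```python
import random
from itertools import combinations


def pivot(nodes, positive, rng):
    """Plain Pivot: repeatedly pick a random unclustered node and
    cluster it with all of its unclustered positive neighbors."""
    remaining = set(nodes)
    clusters = []
    while remaining:
        # Sort before choosing so a seeded rng gives reproducible runs.
        p = rng.choice(sorted(remaining))
        cluster = {p} | {v for v in remaining if frozenset((p, v)) in positive}
        clusters.append(cluster)
        remaining -= cluster
    return clusters


def disagreements(nodes, positive, clusters):
    """Count negative edges inside clusters plus positive edges between clusters."""
    label = {v: i for i, c in enumerate(clusters) for v in c}
    cost = 0
    for u, v in combinations(nodes, 2):
        same_cluster = label[u] == label[v]
        is_positive = frozenset((u, v)) in positive
        if same_cluster != is_positive:
            cost += 1
    return cost


if __name__ == "__main__":
    nodes = [1, 2, 3, 4]
    # Hypothetical toy instance: edges {1,2} and {2,3} are positive, the rest negative.
    positive = {frozenset((1, 2)), frozenset((2, 3))}
    clusters = pivot(nodes, positive, random.Random(0))
    print(clusters, disagreements(nodes, positive, clusters))
```

In the paper's setting the graph handed to Pivot is the auxiliary graph $\hat{G}$ rather than the input graph, and pivot-safety is what ensures the resulting clustering respects the friendly and hostile constraints.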

Paper Structure

This paper contains 16 sections, 10 theorems, 57 equations, 6 figures, 7 algorithms.

Key Result

Lemma 2.3

If $G = (V,E^+, E^-, F,H)$ is the consistent form of $G_0 = (V,E^+_0, E^-_0, F,H)$, then for every $\alpha \geq 1$, an $\alpha$-approximate clustering for $G$ is an $\alpha$-approximate clustering for $G_0$.

Figures (6)

  • Figure 1: Three LP relaxations for Correlation Clustering. $\mathcal{T}(G)$ represents the set of bad triangles in $G = (V,E^+, E^-)$: node triplets $\{i,j,k\}$ where two edges are positive and one edge is negative. Algorithms that rely on solving and rounding variants of the Charging LP tend to be particularly fast and simple.
  • Figure 2: Bad triangle and dangerous pair.
  • Figure 3: A structure leading to a HEAP constraint $x_{ab} + x_{bc} + x_{\rho(ac)} \geq 1$.
  • Figure 4: (a) We display an instance of ConstrainedCC with four supernodes (gray convex sets) $A = \{1,2\}$, $B = \{3,4\}$, $C = \{5,6,7\}$, and $D = \{8\}$. Solid lines indicate positive edges. For easier visualization, the absence of a line between two nodes indicates a negative edge. Supernodes $A$ and $B$ are hostile since $(2,3) \in H$ (dashed red line). (b) Green lines indicate a maximal edge-disjoint set of dangerous pairs $\mathcal{D}$, where different line styles (solid, dotted, dashed) are used to indicate how edges are paired. (c) In constructing $\hat{G}$, four edges from $E_\mathcal{D}$ are flipped from positive to negative because of the first case considered when constructing $\hat{G}$, since $E^+(A,B)$, $E^+(A,D)$, $E^+(B,D)$, and $E^+(B,C)$ are all subsets of $E_\mathcal{D}$. Edges $\{(2,8), (3,8)\}$ are in class $E^\texttt{b}$, and $\{(3,6), (4,5)\}$ are in class $E^\texttt{o}$ (see Section \ref{sec:classes}). (d) For remaining pairs of supernodes, edges are determined based on LP values. For this example, $X_{AC}^+ = X_{CD}^+ = 0$, so all edges from $A$ to $C$ and all edges from $C$ to $D$ become positive. The resulting auxiliary graph $\hat{G}$ is pivot-safe. (e)-(f) For illustration, if node 2 is the first pivot, followed by nodes 8 and then 4, this results in a feasible clustering for instance $G$ with three clusters.
  • Figure 5: We illustrate the 5 edge classes in $G$, and resulting types of bad triangles in $\hat{G}$.
  • ...and 1 more figure
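
The bad-triangle definition in the Figure 1 caption (a node triplet with two positive edges and one negative edge) can be checked by brute force. The sketch below is only an illustration under the assumption that the signed graph is complete and given as a set of positive edges; the function name `bad_triangles` is hypothetical and not taken from the paper.

```python
from itertools import combinations


def bad_triangles(nodes, positive):
    """Return all triplets {i, j, k} with exactly two positive edges
    and one negative edge in a complete signed graph."""
    bad = []
    for trio in combinations(sorted(nodes), 3):
        # Count how many of the three pairs in the triplet are positive edges.
        pos_count = sum(frozenset(e) in positive for e in combinations(trio, 2))
        if pos_count == 2:
            bad.append(trio)
    return bad


if __name__ == "__main__":
    # Hypothetical toy instance: {1,2} and {2,3} positive, all other pairs negative.
    positive = {frozenset((1, 2)), frozenset((2, 3))}
    print(bad_triangles([1, 2, 3, 4], positive))
```

Any clustering must make at least one mistake on each bad triangle, which is why these triplets appear as constraints in the LP relaxations. This enumeration takes $O(n^3)$ time and is meant for small examples only.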

Theorems & Definitions (22)

  • Definition 2.1: Consistent form
  • Definition 2.2
  • Lemma 2.3
  • proof
  • Definition 2.4: Pivot-safe
  • Lemma 2.5: Lemma 3.1 in [vanzuylen2009deterministic]
  • proof
  • Theorem 3.1: Theorems 13 & 14 in [fischer2025faster]
  • Lemma 3.2
  • proof
  • ...and 12 more