
Simultaneously Approximating All Norms for Massively Parallel Correlation Clustering

Nairen Cao, Shi Li, Jia Ye

TL;DR

An efficient algorithm produces a clustering that is simultaneously a $63.3$-approximation for all monotone symmetric norms; it relies on a novel procedure that constructs a $12.66$-approximate fractional clustering for all top-$k$ norms.

Abstract

We revisit the simultaneous approximation model for the correlation clustering problem introduced by Davies, Moseley, and Newman [DMN24]. The objective is to find a clustering that minimizes given norms of the disagreement vector over all vertices. We present an efficient algorithm that produces a clustering that is simultaneously a $63.3$-approximation for all monotone symmetric norms. This significantly improves upon the previous approximation ratio of $6348$ due to Davies, Moseley, and Newman [DMN24], which works only for $\ell_p$-norms. To achieve this result, we first reduce the problem to approximating all top-$k$ norms simultaneously, using the connection between monotone symmetric norms and top-$k$ norms established by Chakrabarty and Swamy [CS19]. Then we develop a novel procedure that constructs a $12.66$-approximate fractional clustering for all top-$k$ norms. Our $63.3$-approximation ratio is obtained by combining this with the $5$-approximate rounding algorithm by Kalhan, Makarychev, and Zhou [KMZ19]. We then demonstrate that, with a loss of $ε$ in the approximation ratio, the algorithm can be adapted to run in nearly linear time and in the MPC (massively parallel computation) model with a polylogarithmic number of rounds. By allowing a further trade-off in the approximation ratio to $(359+ε)$, the number of MPC rounds can be reduced to a constant.
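
To make the objective concrete, the following is a minimal Python sketch of the disagreement vector and the top-$k$ norm it is measured in. It assumes the standard complete, unweighted setting in which the listed edges are the "+" pairs and every other pair is a "−" pair; the names `disagreement_vector`, `top_k_norm`, and `cluster_of` are illustrative and do not come from the paper. It only evaluates a given clustering and is not the paper's algorithm.

```python
from itertools import combinations


def disagreement_vector(vertices, plus_edges, cluster_of):
    """Count, for each vertex, the pairs it is in that the clustering violates.

    Assumption: pairs in plus_edges are '+' (want same cluster); all other
    pairs are '-' (want different clusters).
    """
    plus = {frozenset(e) for e in plus_edges}
    y = {v: 0 for v in vertices}
    for u, v in combinations(vertices, 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        is_plus = frozenset((u, v)) in plus
        # A '+' pair that is split, or a '-' pair that is joined, disagrees.
        if is_plus != same_cluster:
            y[u] += 1
            y[v] += 1
    return y


def top_k_norm(values, k):
    """Sum of the k largest absolute values."""
    return sum(sorted((abs(x) for x in values), reverse=True)[:k])


# Path 0-1-2-3 of '+' edges, split into clusters {0, 1} and {2, 3}.
vertices = [0, 1, 2, 3]
plus_edges = [(0, 1), (1, 2), (2, 3)]
cluster_of = {0: "a", 1: "a", 2: "b", 3: "b"}

y = disagreement_vector(vertices, plus_edges, cluster_of)
print(y)                                       # per-vertex disagreements
print(top_k_norm(y.values(), 1))               # k = 1 is the max (l_infinity) norm
print(top_k_norm(y.values(), len(vertices)))   # k = n is the l_1 norm
```

Taking $k = 1$ and $k = n$ recovers the $\ell_\infty$ and $\ell_1$ norms respectively, which is why a guarantee for every top-$k$ norm already covers both extremes.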

Paper Structure

This paper contains 33 sections, 44 theorems, 113 equations, 5 algorithms.

Key Result

Theorem 1.2

Given a correlation clustering instance $G = (V, E)$, in polynomial time we can construct a simultaneous $63.3$-approximate clustering ${\mathcal{C}}$ for the family of monotone symmetric norms.
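
The constant in Theorem 1.2 matches the two ingredients named in the abstract. As a sketch, and assuming the fractional ratio and the rounding ratio simply multiply (the abstract says only that the two are combined), the arithmetic is:

$$
\underbrace{12.66}_{\text{fractional clustering, all top-}k\text{ norms}} \times \underbrace{5}_{\text{rounding [KMZ19]}} = 63.3 .
$$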

Theorems & Definitions (80)

  • Definition 1.1
  • Theorem 1.2
  • Theorem 1.3
  • Theorem 1.4
  • Theorem 1.5
  • Lemma 2.0
  • Theorem 2.1
  • Theorem 2.2: Chernoff Bound
  • Lemma 3.1
  • Proof of Theorem \ref{thm:sequentialAlgorithm}
  • ...and 70 more