Correlation Clustering with Vertex Splitting

Matthias Bentert, Alex Crane, Pål Grønås Drange, Felix Reidl, Blair D. Sullivan

TL;DR

This work introduces permissive vertex splitting as a unifying operation for modeling overlapping clustering under uncertain data, connecting Correlation Clustering and Multicut with Vertex Splitting. For incomplete information (Correlation Clustering), it proves para-NP-hardness and $n^{1-\varepsilon}$-inapproximability. For complete information (Cluster Editing with Permissive Vertex Splitting), it gives a polynomial kernel of size $O(k^3)$ and a polynomial-time 7-approximation. The results rely on new structures such as bad-star forests and on reductions that preserve the number of splits, and they leave open the questions of smaller kernels and better approximation factors. The work also clarifies the link between Correlation Clustering with Permissive Vertex Splitting (CCPVS) and Multicut variants, giving a theoretical foundation for overlapping clustering when data may be incomplete or ambiguous.

Abstract

We explore Cluster Editing and its generalization Correlation Clustering with a new operation called permissive vertex splitting, which addresses finding overlapping clusters in the face of uncertain information. We determine that both problems are NP-hard, yet they exhibit significant differences in parameterized complexity and approximability. For Cluster Editing with Permissive Vertex Splitting, we show a polynomial kernel when parameterized by the solution size and develop a polynomial-time algorithm with approximation factor 7. In the case of Correlation Clustering, we establish para-NP-hardness when parameterized by solution size and demonstrate that computing an $n^{1-\varepsilon}$-approximation is NP-hard for any constant $\varepsilon > 0$. Additionally, we extend the established link between Correlation Clustering and Multicut to the setting with permissive vertex splitting.

Paper Structure

This paper contains 8 sections, 11 theorems, 5 equations, 1 figure.

Key Result

Lemma 1

An (incomplete) correlation graph $G$ can be clustered with $k$ vertex splits if and only if $G$ has an overlapping clustering of cost $k$.
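
To make the right-hand side of Lemma 1 concrete, the following Python sketch checks a candidate overlapping clustering and computes its cost. The definitions are not stated in this summary, so the sketch assumes them: a clustering is valid if every vertex lies in some cluster, every blue (positive) edge lies inside at least one cluster, and no red (negative) edge lies inside any cluster, and the cost is the sum over all vertices of (number of clusters containing the vertex, minus one). The function names and the toy graph are hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_valid_overlapping_clustering(vertices, blue_edges, red_edges, clusters):
    """Check the ASSUMED validity conditions for an overlapping clustering."""
    clusters = [frozenset(c) for c in clusters]
    # Every vertex must belong to at least one cluster.
    if any(not any(v in c for c in clusters) for v in vertices):
        return False
    # Every blue edge must have both endpoints together in some cluster.
    for u, v in blue_edges:
        if not any(u in c and v in c for c in clusters):
            return False
    # No red edge may have both endpoints together in any cluster.
    for u, v in red_edges:
        if any(u in c and v in c for c in clusters):
            return False
    return True


def clustering_cost(vertices, clusters):
    """ASSUMED cost: sum over vertices of (number of clusters containing v) - 1."""
    membership = Counter(v for c in clusters for v in set(c))
    return sum(membership[v] - 1 for v in vertices)


# Toy instance: a blue path a-b-c with a red edge between a and c.
vertices = ["a", "b", "c"]
blue_edges = [("a", "b"), ("b", "c")]
red_edges = [("a", "c")]

overlapping = [{"a", "b"}, {"b", "c"}]   # b sits in two clusters
single = [{"a", "b", "c"}]               # puts the red edge a-c inside a cluster

valid_overlapping = is_valid_overlapping_clustering(
    vertices, blue_edges, red_edges, overlapping)
valid_single = is_valid_overlapping_clustering(
    vertices, blue_edges, red_edges, single)
cost_overlapping = clustering_cost(vertices, overlapping)
```

Under these assumed definitions, the two-cluster solution is valid with cost 1, which matches the intuition behind Lemma 1: splitting `b` once into two copies, one per cluster, resolves the conflict.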

Figures (1)

  • Figure 1: A vertex $v$ in an (incomplete) correlation graph (top). The bottom row gives toy examples of exclusive (left), inclusive (center), and permissive (right) vertex splits of $v$ into $v_1$ and $v_2$. For clarity, some red edges incident to $v_1$ and $v_2$ are omitted from each figure on the bottom row.

Theorems & Definitions (25)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • ...and 15 more