Correlation Clustering with Vertex Splitting
Matthias Bentert, Alex Crane, Pål Grønås Drange, Felix Reidl, Blair D. Sullivan
TL;DR
This work introduces permissive vertex splitting, an operation for modeling overlapping clusters when the input data is uncertain, and studies it for Cluster Editing and its generalization Correlation Clustering. For Correlation Clustering with Permissive Vertex Splitting (CCPVS), where information may be incomplete, it proves para-NP-hardness when parameterized by solution size and shows that computing an $n^{1-\varepsilon}$-approximation is NP-hard for any constant $\varepsilon > 0$. In the complete-information setting, Cluster Editing with Permissive Vertex Splitting admits a polynomial kernel of size $O(k^3)$, where $k$ is the solution size, and a polynomial-time 7-approximation. The results rely on new structures such as bad-star forests and on reductions that preserve the number of splits. The work also extends the known link between Correlation Clustering and Multicut to the vertex-splitting setting, and it leaves tighter kernels and better approximation factors as open directions.
Abstract
We explore Cluster Editing and its generalization, Correlation Clustering, with a new operation called permissive vertex splitting, which addresses finding overlapping clusters in the face of uncertain information. We determine that both problems are NP-hard, yet they exhibit significant differences in parameterized complexity and approximability. For Cluster Editing with Permissive Vertex Splitting, we show a polynomial kernel when parameterized by the solution size and develop a polynomial-time algorithm with approximation factor 7. In the case of Correlation Clustering, we establish para-NP-hardness when parameterized by solution size and demonstrate that computing an $n^{1-\varepsilon}$-approximation is NP-hard for any constant $\varepsilon > 0$. Additionally, we extend the established link between Correlation Clustering and Multicut to the setting with permissive vertex splitting.
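As background for the two base problems named above, the following minimal sketch shows the standard objects they are built on: a cluster graph (a disjoint union of cliques, the target of Cluster Editing) and the disagreement count that Correlation Clustering minimizes for a given partition. The function names and input encoding are illustrative assumptions, and the sketch deliberately does not model permissive vertex splitting, whose formal definition is not given in this section.

```python
from itertools import combinations


def is_cluster_graph(vertices, edges):
    """Return True if the graph is a disjoint union of cliques.

    Uses the standard characterization: a graph is a cluster graph
    iff it has no induced path on three vertices (P3).
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # An induced P3 exists iff some vertex has two non-adjacent neighbours.
    for v in vertices:
        for a, b in combinations(adj[v], 2):
            if b not in adj[a]:
                return False
    return True


def disagreements(cluster_of, positive, negative):
    """Count disagreements of a partition (hypothetical helper).

    cluster_of maps each vertex to a cluster label; a positive pair
    disagrees when separated, a negative pair when kept together.
    Pairs absent from both lists carry no information and cost nothing.
    """
    cut_positive = sum(cluster_of[u] != cluster_of[v] for u, v in positive)
    kept_negative = sum(cluster_of[u] == cluster_of[v] for u, v in negative)
    return cut_positive + kept_negative
```

In this encoding, a path on three vertices is not a cluster graph, and a vertex that is positively linked to two mutually negative vertices forces at least one disagreement in any partition without overlap. That is the kind of conflict overlapping clusters are meant to resolve.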
