Multilayer Correlation Clustering
Atsushi Miyauchi, Florian Adriaens, Francesco Bonchi, Nikolaj Tatti
TL;DR
This work introduces Multilayer Correlation Clustering, which extends Correlation Clustering to multiple layers by minimizing the $\ell_p$-norm ($p\geq 1$) of the vector of layer-wise disagreements. It develops an $O(L\log n)$-approximation algorithm, where $L$ is the number of layers, via a region-growing technique applied to a pseudometric derived from a convex relaxation. It also studies the probability-constraint variant, for which it gives two constant-factor algorithms: an $(\alpha+2)$-approximation, where $\alpha$ is any approximation ratio achievable for the single-layer problem, and a 4-approximation. The theoretical guarantees are complemented by extensive experiments on real multilayer networks, which show practical effectiveness and favorable comparisons to baselines. The results advance multilayer network analysis by providing clustering algorithms with provable guarantees that respect multiple, potentially uncertain similarity views. The work also bridges clustering, metric-embedding reductions, and fairness considerations in a unified multilayer framework.
Abstract
In this paper, we establish Multilayer Correlation Clustering, a novel generalization of Correlation Clustering (Bansal et al., FOCS '02) to the multilayer setting. In this model, we are given a series of Correlation Clustering inputs (called layers) over a common set $V$. The goal is to find a clustering of $V$ that minimizes the $\ell_p$-norm ($p\geq 1$) of the disagreements vector, that is, the vector with dimension equal to the number of layers whose $i$-th element is the disagreement of the clustering on the $i$-th layer. For this generalization, we first design an $O(L\log n)$-approximation algorithm, where $L$ is the number of layers, based on the well-known region growing technique. We then study an important special case of our problem, namely the problem with the probability constraint. For this case, we first give an $(\alpha+2)$-approximation algorithm, where $\alpha$ is any approximation ratio achievable for the single-layer counterpart. For instance, we can take $\alpha=2.5$ in general (Ailon et al., JACM '08) and $\alpha=1.73+\varepsilon$ for the unweighted case (Cohen-Addad et al., FOCS '23). Furthermore, we design a $4$-approximation algorithm, which improves on the above ratio of $\alpha+2=4.5$ for the general probability-constraint case. Computational experiments using real-world datasets demonstrate the effectiveness of our proposed algorithms.
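To make the objective concrete, the following is a minimal Python sketch of how the $\ell_p$-norm of the disagreements vector could be evaluated for a given clustering. It is an illustration under stated assumptions, not the authors' implementation: the function names `layer_disagreements` and `multilayer_objective`, the dictionary-based layer representation, and the toy data are all hypothetical. The per-layer disagreement is assumed to follow the usual Correlation Clustering convention, in which a pair placed in the same cluster pays its dissimilarity weight and a pair placed in different clusters pays its similarity weight.

```python
from itertools import combinations


def layer_disagreements(labels, w_plus, w_minus):
    """Disagreement of a clustering on one layer (assumed standard definition).

    labels[u] is the cluster id of element u. w_plus and w_minus map a pair
    (u, v) with u < v to its similarity / dissimilarity weight; missing pairs
    are treated as weight 0 (an assumption of this sketch).
    """
    total = 0.0
    for u, v in combinations(range(len(labels)), 2):
        if labels[u] == labels[v]:
            # Pair kept together: pay its dissimilarity weight.
            total += w_minus.get((u, v), 0.0)
        else:
            # Pair separated: pay its similarity weight.
            total += w_plus.get((u, v), 0.0)
    return total


def multilayer_objective(labels, layers, p=1.0):
    """l_p-norm (p >= 1) of the vector of per-layer disagreements."""
    if p < 1:
        raise ValueError("p must be at least 1")
    disagreements = [layer_disagreements(labels, wp, wm) for wp, wm in layers]
    return sum(d ** p for d in disagreements) ** (1.0 / p)


# Hypothetical toy instance: three elements, two layers of (w_plus, w_minus).
layers = [
    ({(0, 1): 1.0, (1, 2): 1.0}, {(0, 2): 1.0}),
    ({(0, 1): 1.0}, {(0, 2): 1.0, (1, 2): 1.0}),
]
one_cluster = [0, 0, 0]   # disagreements vector (1, 2)
two_clusters = [0, 0, 1]  # disagreements vector (1, 0)
print(multilayer_objective(one_cluster, layers, p=1))
print(multilayer_objective(two_clusters, layers, p=2))
```

In this toy instance, larger values of $p$ put more emphasis on the worst-served layer, which is the trade-off the choice of norm controls.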
