Harnessing Multiple Correlated Networks for Exact Community Recovery

Miklós Z. Rácz, Jifan Zhang

TL;DR

The main result derives the precise information-theoretic threshold for exact community recovery using any constant number of correlated graphs, answering a question of Gaudio, Rácz, and Sridhar (COLT 2022).

Abstract

We study the problem of learning latent community structure from multiple correlated networks, focusing on edge-correlated stochastic block models with two balanced communities. Recent work of Gaudio, Rácz, and Sridhar (COLT 2022) determined the precise information-theoretic threshold for exact community recovery using two correlated graphs; in particular, this showcased the subtle interplay between community recovery and graph matching. Here we study the natural setting of more than two graphs. The main challenge lies in understanding how to aggregate information across several graphs when none of the pairwise latent vertex correspondences can be exactly recovered. Our main result derives the precise information-theoretic threshold for exact community recovery using any constant number of correlated graphs, answering a question of Gaudio, Rácz, and Sridhar (COLT 2022). In particular, for every $K \geq 3$ we uncover and characterize a region of the parameter space where exact community recovery is possible using $K$ correlated graphs, even though (1) this is information-theoretically impossible using any $K-1$ of them and (2) none of the latent matchings can be exactly recovered.
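To make the model concrete, the following is a minimal sampling sketch, not the authors' code. It assumes the standard construction of edge-correlated SBMs: a parent SBM with intra-community edge probability $a \frac{\log n}{n}$ and inter-community edge probability $b \frac{\log n}{n}$, whose edges are kept independently with probability $s$ in each of the $K$ graphs, followed by an independent uniformly random relabeling of the vertices of each graph. Drawing the community labels i.i.d. uniformly from $\{+1,-1\}$ and relabeling every graph (including the first) are simplifying assumptions, and the function name `sample_correlated_sbms` is hypothetical.

```python
import numpy as np

def sample_correlated_sbms(n, a, b, s, K, seed=None):
    """Sample K edge-correlated SBMs on n vertices (illustrative sketch).

    Returns (sigma, graphs, perms): the community labels of the parent graph,
    a list of K boolean adjacency matrices, and the K latent relabelings.
    """
    rng = np.random.default_rng(seed)
    p = min(1.0, a * np.log(n) / n)  # intra-community edge probability
    q = min(1.0, b * np.log(n) / n)  # inter-community edge probability

    # Assumption: labels are i.i.d. uniform on {+1, -1} (balanced on average).
    sigma = rng.choice([-1, 1], size=n)

    # Parent SBM: one Bernoulli draw per unordered vertex pair.
    iu, ju = np.triu_indices(n, k=1)
    pair_prob = np.where(sigma[iu] == sigma[ju], p, q)
    parent_edges = rng.random(iu.size) < pair_prob

    graphs, perms = [], []
    for _ in range(K):
        # Keep each parent edge independently with probability s.
        keep = parent_edges & (rng.random(iu.size) < s)
        adj = np.zeros((n, n), dtype=bool)
        adj[iu[keep], ju[keep]] = True
        adj |= adj.T  # symmetrize; the diagonal stays empty

        # Latent relabeling: parent vertex i becomes vertex pi[i].
        pi = rng.permutation(n)
        relabeled = np.zeros_like(adj)
        relabeled[np.ix_(pi, pi)] = adj
        graphs.append(relabeled)
        perms.append(pi)
    return sigma, graphs, perms
```

Calling `sample_correlated_sbms(200, 10, 2, 0.7, 3, seed=0)` yields three graphs whose shared structure is visible only through the hidden permutations in `perms`; an estimator in this setting sees `graphs` alone.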

Paper Structure

This paper contains 40 sections, 43 theorems, 183 equations, 4 figures, 5 algorithms.

Key Result

Theorem 1

Fix constants $a, b > 0$ and $s \in [0,1]$, and let $(G_{1}, G_{2}, \ldots, G_{K}) \sim \mathrm{CSBM}( n, a \frac{\log n}{n}, b \frac{\log n}{n}, s)$. Suppose that the following two conditions both hold: and Then exact community recovery is possible. That is, there is an estimator $\widehat{\boldsymbol{\sigma}} = \widehat{\boldsymbol{\sigma}}(G_{1}, G_{2},\ldots,G_{K})$ such that $\lim\limits_{n

Figures (4)

  • Figure 1: Schematic showing the construction of multiple correlated SBMs (see text for details).
  • Figure 2: Phase diagram for exact community recovery for three graphs with fixed $s$, and $a \in [0, 40]$, $b \in [0, 40]$ on the axes.
      • Green region: exact community recovery is possible from $G_1$ alone.
      • Cyan region: exact community recovery is impossible from $G_1$ alone, but exact graph matching of $G_1$ and $G_2$ is possible, and subsequently exact community recovery is possible from $(G_1, G_2)$.
      • Dark Blue region: exact community recovery is impossible from $G_1$ alone, exact graph matching is also impossible from $(G_1, G_2)$, yet exact community recovery is possible from $(G_1, G_2)$.
      • Pink region: exact community recovery is impossible from $(G_1, G_2)$ (even though it would be possible if $\pi^*_{12}$ were known), yet exact community recovery is possible from $(G_1, G_2, G_3)$.
      • Violet region: exact community recovery is impossible from $(G_1, G_2, G_3)$ (even though it would be possible from $(G_1, G_2)$ if $\pi^*_{12}$ were known).
      • Light Green region: exact community recovery is impossible from $(G_1, G_2)$, but exact graph matching of graph pairs is possible, and subsequently exact community recovery is possible from $(G_1, G_2, G_3)$.
      • Grey region: exact community recovery is impossible from $(G_1, G_2)$, exact graph matching is also impossible from $(G_1, G_2)$, but exact graph matching is possible from $(G_1, G_2, G_3)$, and subsequently exact community recovery is possible from $(G_1, G_2, G_3)$.
      • Yellow region: exact community recovery is impossible from $(G_1, G_2)$, exact graph matching is impossible from $(G_1, G_2, G_3)$, yet exact community recovery is possible from $(G_1, G_2, G_3)$.
      • Orange region: exact community recovery is impossible from $(G_1, G_2, G_3)$ (even though it would be possible from $(G_1, G_2, G_3)$ if $\bm{\pi^*}$ were known).
      • Red region: exact community recovery is impossible from $(G_1, G_2, G_3)$ (even if $\bm{\pi^*}$ is known).
    The principal finding of this paper is the characterization of the Pink, Violet, Orange, Yellow, Grey, and Light Green regions.
  • Figure 3: Schematic landscape of partial matchings over three graphs.
  • Figure 4: Schematic showing the meta graph $\mathcal{MG}_v$ when $K=5$.
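
Some boundaries in the Figure 2 phase diagram can be checked numerically using thresholds known from prior work on SBMs. The sketch below is a hedged illustration, not the paper's classification. It assumes that a single graph allows exact recovery when $s \, D_+(a,b) > 1$, that two graphs can be exactly matched when $s^2 \, T_c(a,b) > 1$, and that recovery with all matchings known is possible when $(1-(1-s)^K) \, D_+(a,b) > 1$, with $D_+(a,b) = (\sqrt{a}-\sqrt{b})^2/2$ and $T_c(a,b) = (a+b)/2$. The symbols $D_+$ and $T_c$ and all function names are assumptions introduced here. The conditions that separate the Pink, Violet, Orange, Yellow, Grey, and Light Green regions are not stated in this text, so they are deliberately left out.

```python
import math

def d_plus(a, b):
    """Assumed notation: D_+(a, b) = (sqrt(a) - sqrt(b))^2 / 2."""
    return (math.sqrt(a) - math.sqrt(b)) ** 2 / 2

def t_c(a, b):
    """Assumed notation: T_c(a, b) = (a + b) / 2."""
    return (a + b) / 2

def known_thresholds(a, b, s, K=3):
    """Evaluate three thresholds assumed from prior work (strict inequalities).

    Behavior exactly at a boundary is not addressed by this sketch.
    """
    return {
        # Green region: exact recovery from G_1 alone.
        "recovery_from_G1_alone": s * d_plus(a, b) > 1,
        # Exact graph matching of the pair (G_1, G_2).
        "exact_matching_G1_G2": s ** 2 * t_c(a, b) > 1,
        # Complement of the Red region: recovery from K graphs
        # when all latent matchings are known.
        "recovery_if_matchings_known": (1 - (1 - s) ** K) * d_plus(a, b) > 1,
    }
```

For example, `known_thresholds(36, 4, 0.5)` evaluates $D_+ = 8$ and $T_c = 20$, so all three conditions hold, whereas `known_thresholds(4, 1, 0.5)` has $D_+ = 0.5$ and fails all three, placing it in the Red region under these assumptions.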

Theorems & Definitions (93)

  • Theorem 1: Exact community recovery from $K$ correlated SBMs
  • Theorem 2: Impossibility of exact community recovery
  • Theorem 3: Exact graph matching from $K$ correlated SBMs
  • Theorem 4: Impossibility of exact graph matching from $K$ correlated SBMs
  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Lemma 3.1
  • proof
  • Lemma 3.2
  • ...and 83 more