Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities (II)

Alexandra Carpentier, Christophe Giraud, Nicolas Verzelen

TL;DR

This work extends the understanding of community recovery in the SBM to the regime $K \ge \sqrt{n}$ by proving Conjecture 1.4 of Carpentier et al. through a motif-based approach. It introduces a blow-up cycle motif with fasteners, constructs a robust motif-counting estimator using Median-of-Means, and derives explicit mean-variance formulas under conditional distributions to ensure reliable recovery above a new density threshold. The results indicate that, in moderately sparse regimes, recovery relies on combinatorial motif counting rather than spectral methods, clarifying the computational barrier and guiding algorithm design in the many-communities setting. Overall, the paper completes the computational-barrier picture for the SBM with $K \ge \sqrt{n}$ communities and establishes polynomial-time recovery in density regimes where recovery above the threshold was previously open.
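As a rough illustration of the Median-of-Means idea mentioned above, the sketch below implements the generic estimator on a list of numbers. It is not the paper's motif-counting estimator: the function name `median_of_means`, the contiguous block split, and the sample data are assumptions made for this example only.

```python
import statistics


def median_of_means(samples, num_blocks):
    """Generic Median-of-Means: split the samples into blocks,
    average each block, and return the median of the block means.

    Hypothetical helper for illustration; how the paper forms its
    blocks of motif counts is not specified here.
    """
    n = len(samples)
    if num_blocks < 1 or num_blocks > n:
        raise ValueError("num_blocks must be between 1 and len(samples)")
    block_means = []
    for b in range(num_blocks):
        # Contiguous, nearly equal-sized blocks (never empty because
        # num_blocks <= n).
        start = b * n // num_blocks
        end = (b + 1) * n // num_blocks
        block = samples[start:end]
        block_means.append(sum(block) / len(block))
    return statistics.median(block_means)


# A single heavy outlier corrupts only one block mean, so the median
# of the block means is unaffected, unlike the plain average.
data = [1, 1, 1, 1, 1, 1000]
robust = median_of_means(data, 3)
plain = sum(data) / len(data)
```

The point of the design is that a few atypical blocks cannot move the median, which is why this kind of estimator is a natural fit when raw counts have heavy tails.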

Abstract

A fundamental theoretical question in network analysis is to determine under which conditions community recovery is possible in polynomial time in the Stochastic Block Model (SBM). When the number $K$ of communities remains smaller than $\sqrt{n}$ (where $n$ denotes the number of nodes), non-trivial community recovery is possible in polynomial time above, and only above, the Kesten-Stigum (KS) threshold, originally postulated using arguments from statistical physics. When $K \geq \sqrt{n}$, Chin, Mossel, Sohn, and Wein recently proved that, in the sparse regime, community recovery in polynomial time is achievable below the KS threshold by counting non-backtracking paths. This finding led them to postulate a new threshold for the many-communities regime $K \geq \sqrt{n}$. Subsequently, Carpentier, Giraud, and Verzelen established the failure of low-degree polynomials below this new threshold across all density regimes, and demonstrated successful recovery above the threshold in certain moderately sparse settings. While these results provide strong evidence that, in the many-communities setting, the computational barrier lies at the threshold proposed in Chin et al., the question of achieving recovery above this threshold still remains open in most density regimes. The present work is a follow-up to Carpentier et al., in which we prove Conjecture 1.4 stated therein by (1) constructing a family of motifs satisfying specific structural properties; and (2) proving that community recovery is possible above the proposed threshold by counting such motifs. Our results complete the picture of the computational barrier for community recovery in the SBM with $K \geq \sqrt{n}$ communities. They also indicate that, in moderately sparse regimes, the optimal algorithms appear to be fundamentally different from spectral methods.
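To make the notion of non-backtracking counting concrete, here is a minimal sketch that counts non-backtracking walks of a given length between two nodes of an undirected graph. It only illustrates the combinatorial object: it is not the estimator of Chin et al., it counts walks (which may revisit nodes) rather than self-avoiding paths, and the function name and edge-list input format are assumptions of this example.

```python
from collections import defaultdict


def count_nonbacktracking_walks(edges, source, target, length):
    """Count walks of `length` steps from source to target that never
    traverse an edge and then immediately return along it.

    Hypothetical helper for illustration only.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    if length == 0:
        return 1 if source == target else 0
    # State: last directed step (prev, cur) -> number of walks from
    # source whose final step is prev -> cur.
    counts = {(source, w): 1 for w in adj[source]}
    for _ in range(length - 1):
        nxt = defaultdict(int)
        for (prev, cur), c in counts.items():
            for w in adj[cur]:
                if w != prev:  # forbid stepping straight back
                    nxt[(cur, w)] += c
        counts = nxt
    return sum(c for (_, cur), c in counts.items() if cur == target)


triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
closed_on_triangle = count_nonbacktracking_walks(triangle, 0, 0, 3)
closed_on_path = count_nonbacktracking_walks(path, 0, 0, 2)
```

On the triangle, the two closed walks of length 3 from node 0 are the two orientations of the cycle; on the path graph, the only closed walk of length 2 from node 0 backtracks and is therefore excluded.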

Paper Structure

This paper contains 20 sections, 8 theorems, 70 equations, 1 figure.

Key Result

Proposition 3.1

Fix any positive integer $I\leq |V_{\mathrm{cyc}}|$ and consider any partition of $V$ into $I+1$ communities such that both $v_1$ and $v_2$ are in the same community. Define $E^{\neq}\subset E$ as the set of edges between nodes of distinct communities. Then, we have

Figures (1)

  • Figure 1: Illustration of the blow-up graph with fasteners $G_{\kappa,\gamma,a}$, in the case where $\gamma = 2, \kappa = 8, a=0.25$. The distinguished nodes $v_1,v_2$ are in orange, the fastener nodes (in $V_{\mathrm{fst}}$) are in red, the other cycle nodes are in blue. The fastener edges (in $E_{\mathrm{fst}}$) are in red, while the other edges are in gray.

Theorems & Definitions (13)

  • Conjecture 2.1: Conjecture 1.4 in Carpentier et al.
  • Proposition 3.1
  • Proposition 3.2
  • Theorem 3.3
  • Proof of Theorem \ref{thm:blow-up}
  • Corollary 3.4
  • Lemma 4.1
  • Lemma 4.2
  • Proof of Lemma \ref{lem:Ecyc_k}
  • Proof of Lemma \ref{lem:eq:out_0}
  • ...and 3 more