Fundamental Limits of Community Detection in Contextual Multi-Layer Stochastic Block Models
Shuyang Gong, Dong Huang, Zhangsong Li
TL;DR
This work studies community detection from the joint observation of a covariate matrix and multiple sparse graphs in the constant-degree regime, and establishes sharp information-theoretic thresholds for both detection and label estimation. It introduces a unified contextual multi-layer SBM, derives a threshold function $F$ whose value determines whether these tasks are feasible, and shows that there is no statistical-computational gap when the number of layers is fixed. On the algorithmic side, it designs detectors and weak-recovery estimators based on counting decorated cycles and decorated paths, uses color-coding to make them run in polynomial time, and proves that they achieve the sharp thresholds. The authors also develop a rigorous framework for information-theoretic lower bounds, including a novel Bernoulli–Gaussian moment comparison and a recovery-to-detection reduction, and corroborate the theory with numerical experiments in sparse multi-layer settings. Overall, the paper advances the understanding of multi-modal network inference in sparse, noisy settings and provides threshold-achieving algorithms for joint contextual detection and recovery.
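To make the color-coding idea mentioned above concrete, here is a minimal sketch of the generic technique applied to plain (undecorated) paths in a single graph. It is not the paper's decorated-cycle or decorated-path statistic; the function names `count_colorful_paths` and `estimate_path_count`, the adjacency-list input, and the choice of simple paths are all assumptions made for illustration.

```python
import math
import random


def count_colorful_paths(adj, colors, k):
    """Count undirected paths on k vertices whose vertices all have distinct colors.

    adj: adjacency list (adj[v] is the list of neighbours of v).
    colors: colors[v] is an integer in {0, ..., k-1}.
    """
    # table[mask][v] = number of directed paths ending at v whose vertices
    # use exactly the colors in `mask`, each color once.
    table = [dict() for _ in range(1 << k)]
    for v, c in enumerate(colors):
        table[1 << c][v] = 1
    # Extending a path always yields a numerically larger mask, so
    # processing masks in increasing order visits each state once.
    for mask in range(1 << k):
        for v, ways in table[mask].items():
            for w in adj[v]:
                bit = 1 << colors[w]
                if not mask & bit:
                    new_mask = mask | bit
                    table[new_mask][w] = table[new_mask].get(w, 0) + ways
    directed = sum(table[(1 << k) - 1].values())
    # Each undirected path with k > 1 vertices is counted once per direction.
    return directed // 2 if k > 1 else directed


def estimate_path_count(adj, k, trials=100, seed=0):
    """Estimate the number of k-vertex paths by averaging over random colorings."""
    rng = random.Random(seed)
    # A fixed k-vertex path is colorful with probability k! / k^k.
    p_colorful = math.factorial(k) / k ** k
    total = 0
    for _ in range(trials):
        colors = [rng.randrange(k) for _ in adj]
        total += count_colorful_paths(adj, colors, k)
    return total / (trials * p_colorful)
```

Restricting to colorful paths lets the dynamic program track a set of colors rather than a set of vertices, so one coloring costs on the order of $2^k$ times the number of edges instead of growing like $n^k$. For example, on a triangle given as `[[1, 2], [0, 2], [0, 1]]` with colors `[0, 1, 2]`, `count_colorful_paths(..., 3)` returns 3.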
Abstract
We consider the problem of community detection from the joint observation of a high-dimensional covariate matrix and $L$ sparse networks, all encoding noisy, partial information about the latent community labels of $n$ subjects. In the asymptotic regime where the networks have constant average degree and the number of features $p$ grows proportionally with $n$, we derive a sharp threshold that determines when detecting and estimating the subject labels is possible. Our results extend the work of \cite{MN23} to the constant-degree regime with noisy measurements, and also resolve a conjecture in \cite{YLS24+} when the number of networks is a constant. Our information-theoretic lower bound is obtained via a novel comparison inequality between Bernoulli and Gaussian moments, together with a statistical variant of the ``recovery to chi-square divergence reduction'' argument inspired by \cite{DHSS25}. On the algorithmic side, we design efficient algorithms based on counting decorated cycles and decorated paths and prove that they achieve the sharp threshold for both detection and weak recovery. In particular, our results show that there is no statistical-computational gap in this setting.
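The observation model described above can be illustrated with a small sampler. This is only a sketch under stated assumptions: balanced $\pm 1$ labels, layer-wise edge probabilities $a_\ell/n$ (same label) and $b_\ell/n$ (different labels), and a rank-one Gaussian spike of strength `mu` in the covariates. This parametrization, the function name, and the parameter names are hypothetical and are not taken from the text.

```python
import math
import random


def sample_contextual_multilayer_sbm(n, p, degrees, mu, seed=0):
    """Sample labels, L sparse graphs, and an n x p covariate matrix.

    degrees: list of (a, b) pairs, one per layer (assumed parametrization);
        same-label pairs are linked with probability a / n and
        different-label pairs with probability b / n.
    mu: strength of the rank-one spike in the covariates (assumed form).
    """
    rng = random.Random(seed)
    labels = [rng.choice((-1, 1)) for _ in range(n)]

    # One edge set per layer; edges are stored as pairs (i, j) with i < j.
    layers = []
    for a, b in degrees:
        edges = set()
        for i in range(n):
            for j in range(i + 1, n):
                prob = (a if labels[i] == labels[j] else b) / n
                if rng.random() < prob:
                    edges.add((i, j))
        layers.append(edges)

    # Assumed covariate model: row i equals sqrt(mu / n) * labels[i] * u
    # plus i.i.d. standard Gaussian noise, with u a Gaussian direction.
    u = [rng.gauss(0.0, 1.0) for _ in range(p)]
    scale = math.sqrt(mu / n)
    covariates = [
        [scale * labels[i] * u[k] + rng.gauss(0.0, 1.0) for k in range(p)]
        for i in range(n)
    ]
    return labels, layers, covariates
```

A call such as `sample_contextual_multilayer_sbm(60, 30, [(5.0, 1.0), (4.0, 2.0)], 2.0)` produces two layers whose expected average degrees stay constant as $n$ grows, which is the sparse regime the abstract refers to. The double loop over vertex pairs is quadratic in $n$ and is meant for small experiments only.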
