Robust recovery for stochastic block models, simplified and generalized
Sidhanth Mohanty, Prasad Raghavendra, David X. Wu
TL;DR
This work shows that robust recovery of communities in sparse stochastic block models (SBMs) under adversarial edge corruptions is possible once the Kesten-Stigum (KS) threshold is crossed, i.e. $\lambda_2(T)^2 d>1$. The strategy has three pieces: use outlier eigenvalues of the Bethe Hessian to identify the community directions, apply a robust PCA-style subspace recovery procedure to cope with sparse adversarial noise, and round the recovered subspace into a labeling that remains well correlated with the planted partition. The authors prove the supporting spectral properties, including a controlled number of outlier eigenvalues and a degree truncation argument that preserves the signal. This extends robust recovery to arbitrary fixed $k$, with the KS threshold as the matching computational barrier. The resulting spectral pipeline tolerates $\Omega(n)$ corruptions and achieves constant correlation with the true communities in polynomial time.
Abstract
We study the problem of $\textit{robust community recovery}$: efficiently recovering communities in sparse stochastic block models in the presence of adversarial corruptions. In the absence of adversarial corruptions, there are efficient algorithms when the $\textit{signal-to-noise ratio}$ exceeds the $\textit{Kesten--Stigum (KS) threshold}$, widely believed to be the computational threshold for this problem. The question we study is: does the computational threshold for robust community recovery also lie at the KS threshold? We answer this question affirmatively, providing an algorithm for robust community recovery for arbitrary stochastic block models on any constant number of communities, generalizing the work of Ding, d'Orsi, Nasser & Steurer on an efficient algorithm above the KS threshold in the case of $2$-community block models. There are three main ingredients to our work: (i) The Bethe Hessian of the graph is defined as $H_G(t) \triangleq (D_G-I)t^2 - A_Gt + I$, where $D_G$ is the diagonal matrix of degrees and $A_G$ is the adjacency matrix. Empirical work suggested that the Bethe Hessian for the stochastic block model has outlier eigenvectors corresponding to the communities right above the Kesten--Stigum threshold. We formally confirm the existence of outlier eigenvalues for the Bethe Hessian by explicitly constructing outlier eigenvectors from the community vectors. (ii) We develop an algorithm for a variant of robust PCA on sparse matrices: specifically, an algorithm that partially recovers top eigenspaces from adversarially corrupted sparse matrices under mild delocalization constraints. (iii) We give a rounding algorithm that turns vector assignments of vertices into a community assignment, inspired by the algorithm of Charikar \& Wirth \cite{CW04} for $2$XOR.
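To make the definition of the Bethe Hessian in ingredient (i) concrete, the following is a minimal numerical sketch, not the authors' algorithm. It builds $H_G(t) = (D_G-I)t^2 - A_Gt + I$ for a small graph and counts its negative eigenvalues. The helper `sample_two_block_sbm`, the parameters `n`, `a`, `b`, and the choice $t = 1/\sqrt{d}$ (with $d$ the empirical average degree) are assumptions made for illustration only; the abstract does not fix any of them.

```python
import numpy as np


def bethe_hessian(adjacency: np.ndarray, t: float) -> np.ndarray:
    """Return H_G(t) = (D_G - I) t^2 - A_G t + I for a symmetric 0/1 adjacency matrix."""
    n = adjacency.shape[0]
    degrees = np.diag(adjacency.sum(axis=1))  # D_G: diagonal matrix of degrees
    identity = np.eye(n)
    return (degrees - identity) * t**2 - adjacency * t + identity


def sample_two_block_sbm(n: int, a: float, b: float, rng: np.random.Generator):
    """Hypothetical helper: symmetric 2-community SBM (n even) with edge
    probability a/n inside a community and b/n across communities."""
    labels = np.repeat([0, 1], n // 2)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, a / n, b / n)
    # Sample each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adjacency = (upper | upper.T).astype(float)
    return adjacency, labels


rng = np.random.default_rng(0)
A, labels = sample_two_block_sbm(n=400, a=9.0, b=1.0, rng=rng)

d = A.sum() / A.shape[0]               # empirical average degree
H = bethe_hessian(A, t=1.0 / np.sqrt(d))  # assumed choice of t, for illustration

eigenvalues = np.linalg.eigvalsh(H)    # H is symmetric, so eigvalsh applies
num_negative = int((eigenvalues < 0).sum())
print("average degree:", d)
print("number of negative eigenvalues:", num_negative)
```

A dense matrix is used here only for brevity; for sparse graphs of realistic size, a sparse representation and an iterative eigensolver would be the natural choice.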
