
Robust recovery for stochastic block models, simplified and generalized

Sidhanth Mohanty, Prasad Raghavendra, David X. Wu

TL;DR

This work establishes that robust recovery of communities in sparse SBMs under adversarial edge corruptions is possible once the KS threshold is crossed, i.e. $\lambda_2(T)^2 d>1$. It develops a three‑piece strategy: leverage Bethe Hessian outliers to identify community directions, apply a robust sparse PCA‑style subspace recovery to cope with sparse adversarial noise, and round the recovered subspace into a labeling that remains well correlated with the planted partition. The authors prove tight spectral properties, including a controlled number of outlier eigenvalues and a degree truncation argument that preserves the signal, extending robust recovery to arbitrary fixed $k$ and matching the KS threshold as the computational barrier. Overall, the approach yields a principled, robust spectral pipeline that can tolerate $\Omega(n)$ corruptions and deliver constant correlation with the true communities in polynomial time, advancing the understanding of information–computational tradeoffs in SBM inference.

Abstract

We study the problem of robust community recovery: efficiently recovering communities in sparse stochastic block models in the presence of adversarial corruptions. In the absence of adversarial corruptions, there are efficient algorithms when the signal-to-noise ratio exceeds the Kesten-Stigum (KS) threshold, widely believed to be the computational threshold for this problem. The question we study is: does the computational threshold for robust community recovery also lie at the KS threshold? We answer this question affirmatively, providing an algorithm for robust community recovery for arbitrary stochastic block models on any constant number of communities, generalizing the work of Ding, d'Orsi, Nasser & Steurer on an efficient algorithm above the KS threshold in the case of $2$-community block models. There are three main ingredients to our work: (i) The Bethe Hessian of the graph is defined as $H_G(t) \triangleq (D_G-I)t^2 - A_Gt + I$, where $D_G$ is the diagonal matrix of degrees and $A_G$ is the adjacency matrix. Empirical work suggested that the Bethe Hessian for the stochastic block model has outlier eigenvectors corresponding to the communities right above the Kesten-Stigum threshold. We formally confirm the existence of outlier eigenvalues for the Bethe Hessian by explicitly constructing outlier eigenvectors from the community vectors. (ii) We develop an algorithm for a variant of robust PCA on sparse matrices: specifically, an algorithm to partially recover top eigenspaces from adversarially corrupted sparse matrices under mild delocalization constraints. (iii) We give a rounding algorithm to turn vector assignments of vertices into a community assignment, inspired by the algorithm of Charikar & Wirth [CW04] for $2$XOR.
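As a concrete illustration of the definition in ingredient (i), the following is a minimal sketch of how $H_G(t)$ could be assembled from an adjacency matrix with NumPy. The function name `bethe_hessian`, the toy 3-vertex path graph, and the choice $t = 0.5$ are assumptions made for illustration only; they are not taken from the paper, and the sketch says nothing about how the paper selects $t$ or uses the spectrum.

```python
import numpy as np


def bethe_hessian(adj, t):
    """Return H_G(t) = (D_G - I) t^2 - A_G t + I for a symmetric adjacency matrix.

    `adj` is assumed to be a dense, symmetric 0/1 matrix (hypothetical input format).
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    identity = np.eye(n)
    degrees = np.diag(adj.sum(axis=1))  # D_G: diagonal matrix of vertex degrees
    return (degrees - identity) * t**2 - adj * t + identity


# Toy example (illustrative only): the path graph 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

H = bethe_hessian(A, 0.5)

# Two sanity checks that follow directly from the formula:
# at t = 0 the matrix is the identity, and at t = 1 it equals D_G - A_G,
# the combinatorial graph Laplacian.
H0 = bethe_hessian(A, 0.0)
L = bethe_hessian(A, 1.0)

# Eigenvalues of the symmetric matrix H, in ascending order.
eigenvalues = np.linalg.eigvalsh(H)
```

A dense matrix keeps the sketch short; for the sparse graphs the abstract is concerned with, a sparse representation would be the more natural choice.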

Paper Structure

This paper contains 26 sections, 29 theorems, 67 equations, 2 algorithms.

Key Result

Theorem 1.2

Let $(\mathrm{M}, \pi, d)$ be SBM parameters such that $d$ is above the KS threshold, and let $\boldsymbol{G},{\boldsymbol{x}}\sim\mathrm{SBM}_n(\mathrm{M}, \pi, d)$. There exists $\delta = \delta(\mathrm{M}, \pi, d) > 0$ such that the following holds. There is a polynomial time algorithm that takes

Theorems & Definitions (60)

  • Definition 1.1: Informal
  • Theorem 1.2: Informal statement of main theorem
  • Remark 1.3: Robustness against node corruptions
  • Proposition 2.1: Bethe Hessian spectrum
  • Proposition 2.2
  • Remark 2.3
  • Lemma 3.3
  • proof
  • Remark 4.2
  • Remark 4.3
  • ...and 50 more