Graph Quasirandomness for Hypothesis Testing of Stochastic Block Models
Kiril Bangachev, Guy Bresler
TL;DR
This work develops a quasirandomness-inspired framework for hypothesis testing between ${\mathbb G}(n,1/2)$ and stochastic block models (SBMs) based on signed subgraph counts. It shows that, in several SBM regimes, an approximate maximizer of the scaled Fourier coefficient $|\Phi(H)|^{1/|V(H)|}$ can be found among stars (the single edge being the one-leaf star) and the 4-cycle, so that whenever a constant-degree polynomial distinguisher exists, the signed count of one of these graphs also distinguishes; triangles enter only in the conjectured extension to general graphons. A central contribution is the leaf-isolation technique and a nonnegative-model comparison, which together bound general SBMs by these baseline testers and yield testing guarantees in multiple SBM settings, including diagonal, nonnegative, and two-community models. The results connect Fourier-analytic SBM quantities to partition functions of associated spin systems and give statistics computable in near-linear to near-quadratic time, with implications for graphon testing and low-degree hardness frameworks. The paper also presents examples, discusses barriers, and outlines future directions toward sparse regimes, vertex-transitive testing, and broader computational-inference connections.
Abstract
The celebrated theorem of Chung, Graham, and Wilson on quasirandom graphs implies that if the 4-cycle and edge counts in a graph $G$ are both close to their typical number in $\mathbb{G}(n,1/2),$ then this also holds for the counts of subgraphs isomorphic to $H$ for any $H$ of constant size. We aim to prove a similar statement where the notion of close is whether the given (signed) subgraph count can be used as a test between $\mathbb{G}(n,1/2)$ and a stochastic block model $\mathbb{SBM}.$ Quantitatively, this is related to approximately maximizing $H \mapsto |\Phi(H)|^{\frac{1}{|\mathsf{V}(H)|}},$ where $\Phi(H)$ is the Fourier coefficient of $\mathbb{SBM}$, indexed by subgraph $H.$ This formulation turns out to be equivalent to approximately maximizing the partition function of a spin model over alphabet equal to the community labels in $\mathbb{SBM}.$ We resolve the approximate maximization when $\mathbb{SBM}$ satisfies one of four conditions: 1) the probability of an edge between any two vertices in different communities is exactly $1/2$; 2) the probability of an edge between two vertices from any two communities is at least $1/2$ (this case is also covered in a recent work of Yu, Zadik, and Zhang); 3) the probability of belonging to any given community is at least $c$ for some universal constant $c>0$; 4) $\mathbb{SBM}$ has two communities. In each of these cases, we show that there is an approximate maximizer of $|\Phi(H)|^{\frac{1}{|\mathsf{V}(H)|}}$ in the set $\mathsf{A} = \{\text{stars, 4-cycle}\}.$ This implies that if there exists a constant-degree polynomial test distinguishing $\mathbb{G}(n,1/2)$ and $\mathbb{SBM},$ then the two distributions can also be distinguished via the signed count of some graph in $\mathsf{A}.$ We conjecture that the same holds true for distinguishing $\mathbb{G}(n,1/2)$ and any graphon if we also add triangles to $\mathsf{A}.$
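To make the signed counts concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual encoding in which an edge contributes $+1$ and a non-edge contributes $-1$, so that under $\mathbb{G}(n,1/2)$ every signed count of a graph with at least one edge has mean zero. It computes the signed counts of the edge, the two-leaf star, and the 4-cycle. The function names, the two-community sampler, and its parameters `p_in` and `p_out` are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def signed_adjacency(adj):
    """Map a 0/1 adjacency matrix to +1 (edge) / -1 (non-edge), zero diagonal."""
    signed = 2 * np.asarray(adj, dtype=np.int64) - 1
    np.fill_diagonal(signed, 0)
    return signed


def signed_edge_count(signed):
    # Each unordered pair appears twice in the symmetric matrix.
    return int(signed.sum()) // 2


def signed_wedge_count(signed):
    # Star with two leaves. For a centre i with row sum r_i, the sum over
    # leaf pairs {j, k} of S_ij * S_ik equals (r_i^2 - (n - 1)) / 2,
    # because S_ij^2 = 1 for every j != i.
    n = signed.shape[0]
    row_sums = signed.sum(axis=1)
    return int((row_sums ** 2 - (n - 1)).sum()) // 2


def signed_four_cycle_count(signed):
    # trace(S^4) sums over closed walks of length 4. Subtract the walks that
    # revisit a vertex, then divide by the 8 labelings of each 4-cycle.
    n = signed.shape[0]
    sq = signed @ signed
    closed_walks = int((sq * sq).sum())  # equals trace(S^4) for symmetric S
    degenerate = 2 * n * (n - 1) ** 2 - n * (n - 1)
    return (closed_walks - degenerate) // 8


def sample_two_block_sbm(n, p_in, p_out, rng):
    """Hypothetical sampler: uniform random labels in {0, 1}, edge probability
    p_in within a community and p_out across communities."""
    labels = rng.integers(0, 2, size=n)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(np.int64)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    null_graph = sample_two_block_sbm(400, 0.5, 0.5, rng)  # G(n, 1/2)
    sbm_graph = sample_two_block_sbm(400, 0.6, 0.5, rng)   # planted structure
    for name, graph in [("null", null_graph), ("sbm", sbm_graph)]:
        s = signed_adjacency(graph)
        print(name, signed_edge_count(s), signed_wedge_count(s),
              signed_four_cycle_count(s))
```

Under $\mathbb{G}(n,1/2)$ the signed counts of distinct copies of $H$ are uncorrelated, so the variance of each statistic equals the number of copies of $H$ in the complete graph on $n$ vertices. A test would compare the count to a multiple of the square root of that number; the choice of threshold is left open in this sketch.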
