Community detection in the hypergraph stochastic block model and reconstruction on hypertrees
Yuzhou Gu, Aaradhya Pandey
TL;DR
The paper analyzes weak recovery in the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. It shows that the Kesten-Stigum (KS) threshold $(r-1)d\lambda^2=1$ is tight for $r\le 6$ when the expected degree $d$ is large enough, and that it is not tight for some parameters whenever $r\ge 5$; in particular, for $r\ge 7$ it is not tight when the model is disassortative or $d$ is large enough. Central to the results is a close link to reconstruction in the broadcasting on hypertrees (BOHT) model, where a belief-propagation-based analysis yields large-degree contraction results via a Gaussian-approximation function $g_{r,w}$, and a robust-reconstruction expansion shows that contraction persists at criticality, so robust reconstruction is impossible there. The paper also gives a non-asymptotic upper bound for HSBM weak recovery, showing that the KS threshold can be non-tight over a broad range of parameters, and derives asymptotic bounds on the weak recovery threshold for fixed $r$ and large $d$, with implications for the BOHT reconstruction threshold. These results rule out an information-computation gap for $r\le 6$ at large degree and add evidence for such a gap in the regimes where the KS threshold is not tight. Overall, the work clarifies when efficient algorithms can achieve weak recovery in hypergraph models and when computational barriers may arise, using hypertree analyses and KL-based information bounds.
Abstract
We study the weak recovery problem on the $r$-uniform hypergraph stochastic block model ($r$-HSBM) with two balanced communities. In this model, $n$ vertices are randomly divided into two communities, and size-$r$ hyperedges are added randomly depending on whether all vertices in the hyperedge are in the same community. The goal of weak recovery is to recover a non-trivial fraction of the communities given the hypergraph. Pal and Zhu (2021) and Stephan and Zhu (2022) established that weak recovery is always possible above a natural threshold called the Kesten-Stigum (KS) threshold. For assortative models (i.e., monochromatic hyperedges are preferred), Gu and Polyanskiy (2023) proved that the KS threshold is tight if $r\le 4$ or the expected degree $d$ is small. For other cases, the tightness of the KS threshold remained open. In this paper we determine the tightness of the KS threshold for a wide range of parameters. We prove that for $r\le 6$ and $d$ large enough, the KS threshold is tight. This shows that there is no information-computation gap in this regime and partially confirms a conjecture of Angelini et al. (2015). On the other hand, we show that for $r\ge 5$, there exist parameters for which the KS threshold is not tight. In particular, for $r\ge 7$, the KS threshold is not tight if the model is disassortative (i.e., polychromatic hyperedges are preferred) or $d$ is large enough. This provides more evidence supporting the existence of an information-computation gap in these cases. Furthermore, we establish asymptotic bounds on the weak recovery threshold for fixed $r$ and large $d$. We also obtain a number of results regarding the broadcasting on hypertrees (BOHT) model, including the asymptotics of the reconstruction threshold for $r\ge 7$ and the impossibility of robust reconstruction at criticality.
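To make the model description concrete, the following is a minimal Python sketch of a brute-force sampler for the two-community $r$-HSBM, suitable only for small $n$. The abstract does not fix a parametrization, so the following are assumptions made for illustration: the function names `sample_hsbm` and `ks_statistic`, the parameters `a` and `b`, the normalization of hyperedge probabilities by $\binom{n}{r-1}$, and the choice of independent uniform labels. The helper `ks_statistic` only evaluates the quantity $(r-1)d\lambda^2$ for user-supplied $d$ and $\lambda$.

```python
import itertools
import math
import random


def sample_hsbm(n, r, a, b, seed=None):
    """Sample a two-community r-uniform HSBM by brute force (small n only).

    Assumption (not stated in the abstract): a size-r subset becomes a
    hyperedge with probability a / C(n, r-1) if all of its vertices lie in
    the same community, and b / C(n, r-1) otherwise.
    """
    rng = random.Random(seed)
    # Assumption: each vertex picks community +1 or -1 independently
    # and uniformly at random.
    labels = [rng.choice((-1, 1)) for _ in range(n)]
    scale = math.comb(n, r - 1)
    p_mono = min(1.0, a / scale)   # monochromatic hyperedge probability
    p_mixed = min(1.0, b / scale)  # polychromatic hyperedge probability
    edges = []
    for subset in itertools.combinations(range(n), r):
        is_mono = len({labels[v] for v in subset}) == 1
        if rng.random() < (p_mono if is_mono else p_mixed):
            edges.append(subset)
    return labels, edges


def ks_statistic(r, d, lam):
    """Return (r - 1) * d * lam**2; the KS threshold is where this equals 1."""
    return (r - 1) * d * lam ** 2


if __name__ == "__main__":
    labels, edges = sample_hsbm(n=30, r=3, a=8.0, b=1.0, seed=0)
    print(len(edges), "hyperedges sampled")
    print("above KS:", ks_statistic(r=3, d=4.0, lam=0.5) > 1)
```

Under this assumed parametrization, `a > b` gives an assortative model and `a < b` a disassortative one. Enumerating all $\binom{n}{r}$ subsets keeps the sketch short but does not scale; a practical sampler would draw the number of hyperedges first and then sample subsets.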
