Variational Inference: Posterior Threshold Improves Network Clustering Accuracy in Sparse Regimes
Xuezhen Li, Can M. Le
TL;DR
This work addresses community detection via variational inference in sparse networks, where existing theory largely covers dense graphs and the variational loss surface is known to have many saddle points. It proposes a simple hard-thresholding step applied to the posterior of the node labels after each iteration of batch coordinate ascent variational inference (BCAVI), yielding Threshold BCAVI (T-BCAVI). The authors prove guarantees under both the stochastic block model (SBM) and the degree-corrected SBM (DCSBM): in sparse regimes the thresholded method recovers the community labels accurately with high probability and yields consistent parameter estimates; when degrees grow, these estimates are asymptotically normal; and the DCSBM extension achieves rate-optimal clustering. Empirically, T-BCAVI outperforms classical BCAVI and a leading sparse-network algorithm in extensive simulations and on real data, without requiring pre-processing, which makes it practical for scalable, robust network clustering.
Abstract
Variational inference has been widely used in the machine learning literature to fit various Bayesian models. In network analysis, it has been successfully applied to community detection problems. Although the results are promising, the supporting theory covers only relatively dense networks, an assumption that may not hold for real networks. In addition, it has recently been shown that the variational loss surface has many saddle points, which may severely degrade the method's performance, especially on sparse networks. This paper proposes a simple way to improve variational inference: hard thresholding the posterior of the community assignment after each iteration. Using a random initialization that correlates with the true community assignment, we show that the proposed method converges and accurately recovers the true community labels, even when the average node degree of the network is bounded. An extensive numerical study further confirms the advantage of the proposed method over classical variational inference and another state-of-the-art algorithm.
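To make the thresholding idea concrete, the following is a minimal sketch of a batch mean-field update with an optional hard-threshold step. It is not the authors' implementation. It assumes a two-community SBM with within-community edge probability p and between-community probability q, a cutoff of 1/2 for the threshold, and a simultaneous update of all nodes. The names `simulate_sbm` and `threshold_bcavi` and all of their parameters are hypothetical, chosen for illustration only.

```python
import numpy as np


def simulate_sbm(n, p, q, rng):
    """Draw a symmetric two-community SBM adjacency matrix without self-loops."""
    z = (np.arange(n) < n // 2).astype(float)         # true labels in {0, 1}
    probs = np.where(np.equal.outer(z, z), p, q)      # edge probability per pair
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # sample each pair once
    A = (upper | upper.T).astype(float)
    return A, z


def threshold_bcavi(A, psi0, n_iter=20, threshold=True, eps=1e-10):
    """Illustrative batch mean-field updates for a two-community SBM.

    psi[i] approximates the posterior probability that node i is in
    community 1. With threshold=True, psi is rounded to {0, 1} after every
    iteration (assumed cutoff 1/2); with threshold=False the soft posterior
    is carried over unchanged.
    """
    n = A.shape[0]
    psi = np.asarray(psi0, dtype=float).copy()
    off_diag = 1.0 - np.eye(n)                        # mask that excludes i == j
    for _ in range(n_iter):
        # Parameter step: expected within / between edge densities under psi.
        same = (np.outer(psi, psi) + np.outer(1 - psi, 1 - psi)) * off_diag
        diff = off_diag - same
        p = np.clip((A * same).sum() / max(same.sum(), eps), eps, 1 - eps)
        q = np.clip((A * diff).sum() / max(diff.sum(), eps), eps, 1 - eps)

        # Label step: log-odds of community 1 versus community 2 for each
        # node, given the current posteriors of all other nodes.
        a = np.log(p / q)                  # contribution of an observed edge
        b = np.log((1 - p) / (1 - q))      # contribution of a missing edge
        s = 2 * psi - 1
        xi = (a - b) * (A @ s) + b * (s.sum() - s)
        psi = 1.0 / (1.0 + np.exp(-np.clip(xi, -30, 30)))

        # Hard threshold applied to the posterior after the iteration.
        if threshold:
            psi = (psi > 0.5).astype(float)
    return psi
```

A possible way to try it is to draw a graph with `simulate_sbm`, build an initialization by flipping a fraction of the true labels so that it still correlates with the truth, and compare the output of `threshold_bcavi` with the true labels up to a swap of the two community names. Setting `threshold=False` gives the unthresholded updates for comparison.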
