Counting communities in weighted Stochastic Block Models via semidefinite programming
Deborah Oliveira, Andressa Cerqueira, Roberto Oliveira
TL;DR
This work addresses estimating the number of communities in weighted, balanced SBMs by developing SDP-based hypothesis tests and estimators. A key novelty is a universality result that replaces the SDP functional of centered, sub-gamma weight matrices with a GOE surrogate, enabling precise thresholding and decision rules for distinguishing candidate numbers of communities and recovering memberships. The approach yields consistent sequential estimation of K and provides partial recovery guarantees for community assignments in both two-community and multi-community cases, under explicit mean-gap and weight-variance conditions. An illustrative zero-inflated Gaussian model demonstrates the method’s applicability, and simulations show SDP-based methods outperform several alternatives in sparse-to-moderate regimes. Overall, the findings extend SDP-based community detection to weighted SBMs and establish a practical, theoretically grounded pathway for learning the true number of communities in complex networks.
Abstract
We consider the problem of estimating the number of communities in a weighted, balanced Stochastic Block Model. We construct hypothesis tests that distinguish between any two candidate numbers of communities; the tests are based on semidefinite programming and use a statistic derived from a Gaussian Orthogonal Ensemble (GOE) matrix. This is made possible by a universality result, which we also prove, for a function defined through semidefinite programming. The tests are then combined into a sequential test that estimates the number of communities. We also construct estimators of the communities themselves.
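The sequential step described in the abstract can be pictured with a minimal sketch. It assumes one common way of chaining pairwise tests: try candidate values in increasing order and stop at the first one that is not rejected. The abstract does not specify the order or the stopping rule, so the paper's actual procedure may differ. The names `estimate_num_communities`, `rejects_null`, `k_max`, and `toy_test` are hypothetical, and the SDP-based test itself is replaced by a toy stand-in.

```python
from typing import Callable, Optional


def estimate_num_communities(
    rejects_null: Callable[[int], bool],
    k_max: int,
) -> Optional[int]:
    """Return the first k in 1..k_max whose null hypothesis 'K = k' is not rejected.

    `rejects_null(k)` is a placeholder for a hypothesis test (for example,
    an SDP-based one) that returns True when the data reject 'K = k'.
    """
    for k in range(1, k_max + 1):
        if not rejects_null(k):
            return k
    return None  # every candidate up to k_max was rejected


# Toy stand-in for the real test (an assumption, not the paper's statistic):
# pretend the data reject every candidate below 3.
def toy_test(k: int) -> bool:
    return k < 3


k_hat = estimate_num_communities(toy_test, k_max=10)
print(k_hat)  # prints 3
```

Passing the test in as a callable keeps the stopping rule separate from the test statistic, so the same loop works with any decision rule that maps a candidate number of communities to a reject or accept decision.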
