Balanced Stochastic Block Model for Community Detection in Signed Networks
Yichao Chen, Weijing Tang, Ji Zhu
TL;DR
The paper tackles community detection in signed networks by introducing the Balanced Stochastic Block Model (BSBM), which imposes population-level balance through a two-meta-group hierarchy on community signs. It develops a fast, EM-based profile pseudo-likelihood estimator that decouples row and column labels, with a constrained update for the sign-probability matrix implemented as a max-cut optimization, and proves strong consistency under weaker signal conditions than are required for unsigned SBMs. In extensive simulations, BSBM performs especially well when edge connectivity signals are weak but sign information is informative, and its performance holds up across varying meta-group sizes, network scales, and numbers of communities. Applications to real-world international relations and protein interaction networks yield interpretable communities that align with known structures and biological pathways, illustrating the practical value of building balance theory into models of signed networks.
Abstract
Community detection, the task of discovering the underlying communities within a network from observed connections, is a fundamental problem in network analysis, yet it remains underexplored for signed networks. In signed networks, both edge connection patterns and edge signs are informative, and structural balance theory provides a global higher-order principle that guides community formation: triangles consistent with "the enemy of my enemy is my friend" and "the friend of my friend is my friend" are more prevalent. We propose a Balanced Stochastic Block Model (BSBM), which incorporates balance theory into the network generating process such that balanced triangles are more likely to occur. We develop a fast profile pseudo-likelihood estimation algorithm with provable convergence and establish that our estimator achieves strong consistency under weaker signal conditions than methods for the binary SBM that rely solely on edge connectivity. Extensive simulation studies and applications to two real-world signed networks demonstrate strong empirical performance.
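To make the notion of a balanced triangle concrete, the following minimal sketch counts balanced and unbalanced triangles in a small signed graph, using the standard definition that a triangle is balanced when the product of its three edge signs is positive. The function name `triangle_balance`, the edge-dictionary representation, and the toy network are illustrative assumptions, not part of the paper's method or code.

```python
from itertools import combinations


def triangle_balance(signs):
    """Count (balanced, unbalanced) triangles in an undirected signed graph.

    `signs` maps each edge, stored as frozenset({u, v}), to +1 or -1.
    A triangle is balanced when the product of its edge signs is positive.
    """
    nodes = sorted({v for edge in signs for v in edge})
    balanced = unbalanced = 0
    for u, v, w in combinations(nodes, 3):
        edges = (frozenset((u, v)), frozenset((v, w)), frozenset((u, w)))
        if not all(e in signs for e in edges):
            continue  # the three nodes do not form a closed triangle
        product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
        if product > 0:
            balanced += 1
        else:
            unbalanced += 1
    return balanced, unbalanced


# Hypothetical toy network: two camps {a, b} and {c, d},
# with positive edges inside each camp and negative edges across camps.
toy = {
    frozenset(("a", "b")): +1,
    frozenset(("c", "d")): +1,
    frozenset(("a", "c")): -1,
    frozenset(("a", "d")): -1,
    frozenset(("b", "c")): -1,
    frozenset(("b", "d")): -1,
}

print(triangle_balance(toy))  # (4, 0): every triangle is balanced

# Flipping the within-camp edge c-d to negative breaks balance
# in the two triangles that contain it.
perturbed = dict(toy)
perturbed[frozenset(("c", "d"))] = -1
print(triangle_balance(perturbed))  # (2, 2)
```

In the toy network, each triangle has either three positive edges or one positive and two negative edges, which correspond to the "friend of my friend" and "enemy of my enemy" patterns. The brute-force loop over node triples is cubic in the number of nodes and is meant only for illustration on small graphs.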
