Two-way Node Popularity Model for Directed and Bipartite Networks
Bing-Yi Jing, Ting Li, Jiangzhou Wang, Ya Wang
TL;DR
TNPM extends community detection to directed and bipartite networks by modeling edge means as $P_{ij} = E(A_{ij}) = \Lambda(i,z_j) \widetilde{\Lambda}(j,c_i)$, which gives the mean matrix a block rank-one structure. The authors develop the Delete-One-Method (DOM) and the Two-Stage Divided Cosine Algorithm (TSDC) to fit TNPM and identify communities when the numbers of communities $K$ and $L$ are unknown, while accommodating sub-Gaussian edge distributions. They prove identifiability under mild assumptions and establish consistency of both the probability estimator and community detection, with finite-sample bounds. Empirical results on synthetic data and two real datasets (the Worldwide Food Trading Networks and MovieLens 100K) demonstrate improved accuracy and scalability, and reveal interpretable, domain-relevant structure.
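To make the block rank-one claim concrete, the following is a minimal NumPy sketch of the mean structure only; it is not the authors' DOM or TSDC procedure. It assumes that $c_i$ is the community label of row node $i$ (one of $K$ groups) and $z_j$ is the community label of column node $j$ (one of $L$ groups), with labels zero-indexed. The names `tnpm_mean`, `Lam`, and `Lam_tilde`, the sizes, and the uniform popularity values are all hypothetical choices for illustration, and the Bernoulli draw is just one example of a sub-Gaussian edge distribution.

```python
import numpy as np


def tnpm_mean(Lam, Lam_tilde, c, z):
    """Build P with P[i, j] = Lam[i, z[j]] * Lam_tilde[j, c[i]].

    Lam       : (n, L) popularity of row node i toward column community l
    Lam_tilde : (m, K) popularity of column node j toward row community k
    c         : (n,) row-node labels in {0, ..., K-1}   (assumed meaning)
    z         : (m,) column-node labels in {0, ..., L-1} (assumed meaning)
    """
    # Lam[:, z] has entry (i, j) = Lam[i, z[j]];
    # Lam_tilde[:, c].T has entry (i, j) = Lam_tilde[j, c[i]].
    return Lam[:, z] * Lam_tilde[:, c].T


rng = np.random.default_rng(0)
n, m, K, L = 12, 15, 2, 3           # hypothetical sizes
c = np.arange(n) % K                # every row community is non-empty
z = np.arange(m) % L                # every column community is non-empty
Lam = rng.uniform(0.2, 0.9, size=(n, L))
Lam_tilde = rng.uniform(0.2, 0.9, size=(m, K))

P = tnpm_mean(Lam, Lam_tilde, c, z)

# Within block (k, l) the mean is the outer product of Lam[c == k, l] and
# Lam_tilde[z == l, k], so each block should have rank one.
block_ranks = {
    (k, l): int(np.linalg.matrix_rank(P[np.ix_(c == k, z == l)]))
    for k in range(K)
    for l in range(L)
}
print(block_ranks)

# One possible sub-Gaussian edge model: independent Bernoulli edges with mean P.
A = rng.binomial(1, P)
```

The full matrix `P` is generally not low rank as a whole; the rank-one property holds block by block, once rows and columns are grouped by their community labels.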
Abstract
There has been extensive research on community detection in directed and bipartite networks. However, these studies often fail to consider the popularity of nodes in different communities, which is a common phenomenon in real-world networks. To address this issue, we propose a new probabilistic framework called the Two-Way Node Popularity Model (TNPM). The TNPM also accommodates edges from different distributions within a general sub-Gaussian family. We introduce the Delete-One-Method (DOM) for model fitting and community structure identification, and provide a comprehensive theoretical analysis with novel technical tools for handling the sub-Gaussian generalization. Additionally, we propose the Two-Stage Divided Cosine Algorithm (TSDC) to handle large-scale networks more efficiently. Our proposed methods offer multifold advantages in estimation accuracy and computational efficiency, as demonstrated through extensive numerical studies. We apply our methods to two real-world applications, uncovering interesting findings.
