Beyond Asymptotics: Practical Insights into Community Detection in Complex Networks

Tianjun Ke, Zhiyu Xu

TL;DR

This work asks how community-detection algorithms for the stochastic block model (SBM) compare in finite samples across common inference paradigms. It benchmarks Gibbs sampling, variational Bayes, variational-EM, and spectral methods (including SCORE, $L_2$ normalization, and regularized spectral clustering) under varied SNR, heterogeneous community sizes, and multimodal connectivity, reporting ARI and NMI over extensive simulations. Key findings: SCORE dominates the other spectral methods, Gibbs sampling excels in small, well-separated networks, variational-EM offers the best trade-off for larger networks, and variational Bayes often underperforms. The results highlight clear practical trade-offs and motivate further theory for SBMs with complex structures and imbalance. Code is available at https://github.com/Toby-X/SBM_computation.

Abstract

The stochastic block model (SBM) is a fundamental tool for community detection in networks, yet the finite-sample performance of inference methods remains underexplored. We evaluate key algorithms (spectral methods, variational inference, and Gibbs sampling) under varying conditions, including signal-to-noise ratios, heterogeneous community sizes, and multimodality. Our results highlight significant performance variations: spectral methods, especially SCORE, excel in computational efficiency and scalability, while Gibbs sampling dominates in small, well-separated networks. Variational Expectation-Maximization strikes a balance between accuracy and cost in larger networks but struggles with optimization in highly imbalanced settings. These findings underscore the practical trade-offs among methods and provide actionable guidance for algorithm selection in real-world applications. Our results also call for further theoretical investigation in SBMs with complex structures. The code can be found at https://github.com/Toby-X/SBM_computation.
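
For readers unfamiliar with the setup, the following is a minimal sketch of sampling an undirected SBM with unequal community sizes, using only the Python standard library. It is not taken from the authors' repository. The function name `sample_sbm` and the parameters `sizes`, `p_in`, and `p_out` are illustrative assumptions; the gap between `p_in` and `p_out` acts as a rough stand-in for the signal-to-noise ratio.

```python
import random


def sample_sbm(sizes, p_in, p_out, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a simple SBM.

    sizes : community sizes, e.g. [30, 10] for an imbalanced network (assumed layout)
    p_in  : edge probability within a community (illustrative parameter)
    p_out : edge probability between communities (illustrative parameter)
    """
    rng = random.Random(seed)
    # Node i gets the index of the community it belongs to.
    labels = [k for k, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, so no self-loops
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj, labels


adj, labels = sample_sbm([30, 10], p_in=0.6, p_out=0.1, seed=1)
```

Shrinking the gap between `p_in` and `p_out`, or making `sizes` more uneven, produces the harder regimes the abstract refers to.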

Paper Structure

This paper contains 22 sections, 18 equations, 2 figures, 1 algorithm.

Figures (2)

  • Figure 1: ARI comparison of all methods over 100 different random seeds. L2 denotes the one-class SVM method. VEMB and VEMG denote variational-EM with the Bernoulli and Gaussian models, respectively. Other methods are labeled by their abbreviations. SCORE dominates the other spectral methods, with RSC performing worst. Variational Bayes is inferior to the variational-EM methods. Variational-EM performance deteriorates as $N$ increases in challenging scenarios. Gibbs sampling dominates only in simple networks, despite being an exact method.
  • Figure 2: NMI comparison of all methods over 100 different random seeds. L2 denotes the one-class SVM method. VEMB and VEMG denote variational-EM with the Bernoulli and Gaussian models, respectively. Other methods are labeled by their abbreviations. Results are similar to those for ARI.
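
Both figures score an estimated community assignment against the ground truth. As a reminder of what ARI measures, here is a minimal standard-library sketch of the adjusted Rand index computed from pair counts. It illustrates the standard definition and is not the authors' evaluation code; the function name `adjusted_rand_index` is an assumption.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """ARI between two labelings of the same nodes.

    Returns 1.0 when the partitions agree up to relabeling and is
    close to 0 for an assignment no better than chance.
    """
    if len(truth) != len(pred):
        raise ValueError("labelings must have the same length")
    n = len(truth)
    if n < 2:
        return 1.0  # no node pairs to compare
    # Number of node pairs placed together by both, by truth, and by pred.
    together_both = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    together_truth = sum(comb(c, 2) for c in Counter(truth).values())
    together_pred = sum(comb(c, 2) for c in Counter(pred).values())
    # Agreement expected by chance, given the cluster sizes.
    expected = together_truth * together_pred / comb(n, 2)
    max_index = (together_truth + together_pred) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (together_both - expected) / (max_index - expected)


# Label names do not matter: swapping 0 and 1 still gives a perfect score.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because ARI is invariant to label permutation, it suits community detection, where the numbering of recovered communities is arbitrary.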