Lower Ricci Curvature for Efficient Community Detection

Yun Jin Park, Didong Li

TL;DR

This work addresses the computational bottleneck of curvature-based community detection by introducing Lower Ricci Curvature (LRC), a discrete curvature with linear cost $O(mn)$ and scale-free behavior. It couples LRC with a non-iterative preprocessing step that uses a two-component Gaussian Mixture Model to threshold and prune low-LRC edges, thereby sharpening community structure for a wide range of algorithms. Across simulations on SBM benchmarks and real-world networks (NCAA, DBLP, Amazon, YouTube), LRC preprocessing consistently improves accuracy (e.g., ARI/AMI/F1) and often reduces runtimes, underscoring both effectiveness and efficiency on large-scale graphs. The method is theoretically grounded via connections to the Cheeger constant and diameter bounds and extends to weighted graphs; code and data are publicly available for further study.

Abstract

This study introduces the Lower Ricci Curvature (LRC), a novel, scalable, and scale-free discrete curvature designed to enhance community detection in networks. Addressing the computational challenges posed by existing curvature-based methods, LRC offers a streamlined approach with linear computational complexity, making it well-suited for large-scale network analysis. We further develop an LRC-based preprocessing method that effectively augments popular community detection algorithms. Through comprehensive simulations and applications on real-world datasets, including the NCAA football league network, the DBLP collaboration network, the Amazon product co-purchasing network, and the YouTube social network, we demonstrate the efficacy of our method in significantly improving the performance of various community detection algorithms.
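
As a rough illustration of why a curvature of this kind is cheap to compute, the sketch below evaluates an LRC-style score for every edge of a small unweighted graph. The formula is an assumption: this summary does not state the LRC definition, so the code uses the Balanced Forman Curvature with its 4-cycle term dropped, which needs only the endpoint degrees and the number of shared neighbors. The helper names `build_adjacency` and `lower_ricci_curvature` are hypothetical and are not taken from the authors' code.

```python
def build_adjacency(edges):
    """Map each node to the set of its neighbours (simple undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def lower_ricci_curvature(adj):
    """Return {(i, j): score} for every undirected edge with i < j.

    ASSUMED formula (not stated in this summary):
        2/n_i + 2/n_j - 2 + 2*n_ij/max(n_i, n_j) + n_ij/min(n_i, n_j)
    where n_i is the degree of i and n_ij the number of common neighbours.
    """
    scores = {}
    for i, neighbours in adj.items():
        for j in neighbours:
            if i < j:  # visit each undirected edge once
                n_i, n_j = len(adj[i]), len(adj[j])
                n_ij = len(adj[i] & adj[j])  # triangles through edge (i, j)
                scores[(i, j)] = (
                    2 / n_i + 2 / n_j - 2
                    + 2 * n_ij / max(n_i, n_j)
                    + n_ij / min(n_i, n_j)
                )
    return scores


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
lrc = lower_ricci_curvature(build_adjacency(edges))
bridge = min(lrc, key=lrc.get)  # edge with the lowest score
```

On this toy graph the bridge edge `(2, 3)` is the only edge with a negative score, which is the kind of edge the preprocessing step is meant to prune.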

Paper Structure

This paper contains 18 sections, 2 theorems, 9 equations, 3 figures, 5 tables, 1 algorithm.

Key Result

Proposition 1

For any edge $(ij)$,

Figures (3)

  • Figure 1: (a) A toy simulated network with two communities, with edges colored by ORC. (b) The histogram of ORC, suggesting its potential in community detection.
  • Figure 2: (a) An SBM-generated network with $K=2$, $B_{kk}=0.8$, $B_{kl}=0.05$ for $k\neq l$, with edges colored by LRC. (b) The histogram of LRC, suggesting its potential in community detection. (c) The histogram of $\Delta$ for within and across community edges, indicating that $\Delta$ may not significantly contribute to community detection.
  • Figure 3: (a) An SBM-generated network with $K=2$, $B_{kk}=0.8$, $B_{kl}=0.05$ for $k\neq l$, with edges colored by LRC. (b) The histogram of LRC, with each bar colored by LRC. (c) The threshold $\beta$ (the dotted vertical line) estimated by GMM. GMM1 is the mixing component with the larger mean $\mu_2$; GMM2 is the mixing component with the smaller mean $\mu_1$. (d) The processed network, exhibiting a more discernible community structure.
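
The thresholding step illustrated in Figure 3 can be sketched as follows, using only the Python standard library. Several parts are assumptions rather than the authors' procedure: the rule for deriving $\beta$ from the fitted mixture is not given in this summary, so the sketch takes the point between the two component means where their weighted densities are equal; the EM routine is a plain textbook implementation; the function names are hypothetical; and the edge scores are made-up illustrative values, not results from the paper.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_component_gmm(values, iters=200, min_sigma=1e-3):
    """Fit a 1-D two-component Gaussian mixture by EM.

    Returns two (weight, mean, sigma) tuples ordered by mean (low, high).
    """
    xs = sorted(values)
    half = len(xs) // 2
    params = []
    for group in (xs[:half], xs[half:]):  # initialise from a median split
        mu = sum(group) / len(group)
        var = sum((x - mu) ** 2 for x in group) / len(group)
        params.append((len(group) / len(xs), mu, max(math.sqrt(var), min_sigma)))
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each value
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, mu, s) for w, mu, s in params]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and spread of each component
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            if n_k < 1e-12:
                continue
            mu = sum(r[k] * x for r, x in zip(resp, xs)) / n_k
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / n_k
            params[k] = (n_k / len(xs), mu, max(math.sqrt(var), min_sigma))
    return sorted(params, key=lambda p: p[1])


def equal_posterior_threshold(low, high, steps=60):
    """ASSUMED rule: bisect for the crossing of the two weighted densities."""
    (w1, m1, s1), (w2, m2, s2) = low, high

    def diff(x):
        return w1 * normal_pdf(x, m1, s1) - w2 * normal_pdf(x, m2, s2)

    lo, hi = m1, m2
    if not (diff(lo) > 0 > diff(hi)):
        return 0.5 * (m1 + m2)  # fall back to the midpoint of the means
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative (made-up) edge scores: six high values, three low values.
scores = {
    (0, 1): 0.95, (0, 2): 0.85, (1, 2): 0.90,
    (3, 4): 1.00, (3, 5): 0.80, (4, 5): 0.88,
    (2, 3): -0.60, (1, 4): -0.70, (0, 5): -0.55,
}
low, high = fit_two_component_gmm(list(scores.values()))
beta = equal_posterior_threshold(low, high)
kept = [edge for edge, s in scores.items() if s >= beta]
pruned = [edge for edge, s in scores.items() if s < beta]
```

Edges scoring below `beta` are dropped and the remaining edges form the processed network, which would then be handed to whichever community detection algorithm is in use.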

Theorems & Definitions (5)

  • Definition 1: Forman Ricci Curvature (FRC)
  • Definition 2: Balanced Forman Curvature (BFC)
  • Definition 3: Ollivier-Ricci Curvature (ORC)
  • Proposition 1
  • Corollary 1