
Bayan Algorithm: Detecting Communities in Networks Through Exact and Approximate Optimization of Modularity

Samin Aref, Mahdi Mostajabdaveh, Hriday Chheda

TL;DR

This work compares 30 community detection methods, including the proposed Bayan algorithm, which offers optimality and approximation guarantees. The results point to a few well-performing algorithms, among which Bayan stands out as the most reliable method for small networks.

Abstract

Community detection is a classic network problem with extensive applications in various fields. The most common approach is to use modularity maximization heuristics, which rarely return an optimal partition or anything similar to one. Partitions with globally optimal modularity are difficult to compute and have therefore been underexplored. Using structurally diverse networks, we compare 30 community detection methods, including our proposed algorithm that offers optimality and approximation guarantees: the Bayan algorithm. Unlike existing methods, Bayan globally maximizes modularity or approximates it within a factor. Our results show the distinctive accuracy and stability of maximum-modularity partitions, which retrieve planted partitions at rates higher than most alternatives for a wide range of parameter settings in two standard benchmarks. Compared to the partitions from 29 other algorithms, maximum-modularity partitions have the best medians for description length, coverage, performance, average conductance, and well clusteredness. These advantages come at the cost of additional computations, which Bayan makes possible for small networks (networks that have up to 3000 edges in their largest connected component). Bayan is several times faster than using open-source and commercial solvers for modularity maximization, making it capable of finding optimal partitions for instances that cannot be optimized by any other existing method. Our results point to a few well-performing algorithms, among which Bayan stands out as the most reliable method for small networks. A Python implementation of the Bayan algorithm (bayanpy) is publicly available through the package installer for Python.
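
To make "globally optimal modularity" concrete, the following minimal sketch computes the standard modularity of a partition and finds the maximum by enumerating every partition of a toy graph. This is an illustration only: brute-force enumeration is not the Bayan algorithm and does not use bayanpy. The function names (modularity, all_partitions, brute_force_max_modularity) and the six-node example graph are assumptions introduced here, not taken from the paper.

```python
def modularity(edges, labels):
    """Modularity of a partition of an undirected, unweighted graph.

    edges  : list of (u, v) pairs with integer node ids
    labels : sequence where labels[node] is the community of that node
    Uses Q = sum over communities c of  L_c / m - (d_c / (2m))^2,
    where L_c is the number of internal edges and d_c the degree sum.
    """
    m = len(edges)
    internal = {}    # community -> number of edges inside it
    degree_sum = {}  # community -> sum of degrees of its nodes
    for u, v in edges:
        for node in (u, v):
            c = labels[node]
            degree_sum[c] = degree_sum.get(c, 0) + 1
        if labels[u] == labels[v]:
            internal[labels[u]] = internal.get(labels[u], 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


def all_partitions(n):
    """Yield every partition of n nodes as a tuple of community labels.

    Labels follow the restricted-growth rule (a node may reuse an existing
    label or open the next new one), so each partition appears exactly once.
    """
    def extend(prefix, max_label):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for label in range(max_label + 2):
            yield from extend(prefix + [label], max(max_label, label))
    yield from extend([], -1)


def brute_force_max_modularity(edges, n):
    """Exhaustive search; only feasible for a handful of nodes."""
    best = max(all_partitions(n), key=lambda labels: modularity(edges, labels))
    return best, modularity(edges, best)


# Hypothetical toy graph: two triangles joined by a single bridge edge (2-3).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
best_labels, best_q = brute_force_max_modularity(EDGES, 6)
print(best_labels, round(best_q, 4))
```

The number of partitions grows as the Bell numbers (203 for six nodes), so exhaustive search stops being practical after a few more nodes. That growth is why exact modularity maximization on networks with thousands of edges needs a dedicated optimization method rather than enumeration.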

Paper Structure (36 sections, 9 equations, 11 figures, 2 tables)

Figures (11)

  • Figure 1: Performance ranking of 30 CD algorithms based on AMI averaged over 100 LFR graphs for each data point. (Magnify the high-resolution figure on screen for details.)
  • Figure 2: Performance ranking of 30 CD algorithms based on AMI averaged over 100 ABCD graphs for each data point. (Magnify the high-resolution figure on screen for details.)
  • Figure 3: Adjusted mutual information (AMI) values of the CD algorithms indicating the similarity of their partitions with node attributes for five real networks. (Color version online. Magnify the high-resolution figure on screen for details.)
  • Figure 4: The distribution of description length values for the partitions produced by each algorithm on the 500 LFR networks (top panel) and the 500 ABCD networks (bottom panel). The algorithms are sorted from right to left based on having a more desirable median description length.
  • Figure 5: The distribution of modularity values for the partitions produced by each algorithm on the 500 LFR networks (top panel) and the 500 ABCD networks (bottom panel). The algorithms are sorted from right to left based on having a more desirable median modularity.
  • ...and 6 more figures