Quantifying Group Fairness in Community Detection

Elze de Vink, Frank W. Takes, Akrati Saxena

TL;DR

This work proposes a set of novel group fairness metrics to assess the fairness of community detection methods and observes that Infomap and Significance methods are high-performing and fair with respect to different types of communities across most networks.

Abstract

Understanding community structures is crucial for analyzing networks, as nodes join communities that collectively shape large-scale networks. In real-world settings, the formation of communities is often impacted by several social factors, such as ethnicity, gender, wealth, or other attributes. These factors may introduce structural inequalities; for instance, real-world networks can have a few majority groups and many minority groups. Community detection algorithms, which identify communities based on network topology, may generate unfair outcomes if they fail to account for existing structural inequalities, particularly affecting underrepresented groups. In this work, we propose a set of novel group fairness metrics to assess the fairness of community detection methods. Additionally, we conduct a comparative evaluation of the most common community detection methods, analyzing the trade-off between performance and fairness. Experiments are performed on synthetic networks generated using LFR, ABCD, and HICH-BA benchmark models, as well as on real-world networks. Our results demonstrate that the fairness-performance trade-off varies widely across methods, with no single class of approaches consistently excelling in both aspects. We observe that Infomap and Significance methods are high-performing and fair with respect to different types of communities across most networks. The proposed metrics and findings provide valuable insights for designing fair and effective community detection algorithms.

Paper Structure

This paper contains 33 sections, 3 equations, 15 figures, and 3 tables.

Figures (15)

  • Figure 1: FCCN vs. normalized community size for each community on a small sample network. The best-fit line shows the trend of a community detection method.
  • Figure 2: Community-wise fairness metric score as the number of misclassified nodes in the predicted community increases. The plot shows the average FCCE values over 20 iterations, along with the highest and lowest recorded values at each point.
  • Figure 3: Analysis of (a) the behavior of community-wise performance metrics and (b) group fairness on a HICH-BA network having both minority and majority communities.
  • Figure 4: Correlation between community properties (size, density, and conductance) in LFR, ABCD, and HICH-BA networks.
  • Figure 5: NMI vs. fairness of community detection methods with respect to community size for LFR networks of 10,000 nodes having different $\mu$ values.
  • ...and 10 more figures