Individual Fairness in Community Detection: Quantitative Measure and Comparative Evaluation

Fabrizio Corriera, Frank W. Takes, Akrati Saxena

TL;DR

This paper introduces a novel measure of individual fairness for community detection methods, computed using the community co-occurrence matrix. It shows that individual unfairness can occur even when group fairness or clustering accuracy is high, so individual and group fairness are not interchangeable.

Abstract

Community detection is a fundamental task in complex network analysis. Fairness-aware community detection seeks to prevent biased node partitions, typically framed in terms of individual fairness, which requires similar nodes to be treated similarly, and group fairness, which aims to avoid disadvantaging specific groups of nodes. While existing literature on fair community detection has primarily focused on group fairness, we introduce a novel measure to quantify individual fairness in community detection methods. The proposed measure captures unfairness as the vectorial distance between a node's true and predicted community representations, computed using the community co-occurrence matrix. We provide a comprehensive empirical investigation of a broad set of community detection algorithms from the literature on both synthetic networks, with varying levels of community explicitness, and real-world networks. We particularly investigate the fairness-performance trade-off using standard quality metrics and compare individual fairness outcomes with existing group fairness measures. The results show that individual unfairness can occur even when group fairness or clustering accuracy is high, underscoring that individual and group fairness are not interchangeable. Moreover, fairness depends critically on the detectability of community structure. However, we find that Significance and Surprise for denser graphs, and Combo, Leiden, and SBMDL for sparser graphs result in a better trade-off between individual fairness and community quality. Overall, our findings, together with the fact that community detection is an important step in many network analysis downstream tasks, highlight the necessity of developing fairness-aware community detection methods.

Paper Structure (26 sections, 6 equations, 14 figures, 6 tables)

This paper contains 26 sections, 6 equations, 14 figures, 6 tables.

Figures (14)

  • Figure 1: The $IB$ calculation process. The left side shows a graph with ground-truth communities (coded by colour) and the communities identified by a CD method (represented by the dotted ellipses). We first compute the $CC$ matrix from the ground-truth communities and from the predicted communities (top and bottom of the image, respectively). We then obtain $IB$ (on the right) by applying the vectorial distance between the corresponding rows of the two matrices. The values shown are obtained using the cosine distance.
  • Figure 2: $IB$ behaviour of the expanding context of a node belonging to the minority community ((a), (b) and (c)) and majority community ((d), (e) and (f)). (a) and (d) correspond to graphs with 100 nodes, (b) and (e) correspond to graphs with 1,000 nodes, and (c) and (f) correspond to graphs with 10,000 nodes.
  • Figure 3: $IB$ behaviour of the shrinking context of a node belonging to the minority community ((a), (b) and (c)) and majority community ((d), (e) and (f)). (a) and (d) correspond to graphs with 100 nodes, (b) and (e) correspond to graphs with 1,000 nodes, and (c) and (f) correspond to graphs with 10,000 nodes.
  • Figure 4: $IB$ behaviour of the changing context of a node belonging to the minority community ((a), (b) and (c)) and majority community ((d), (e) and (f)). (a) and (d) correspond to graphs with 100 nodes, (b) and (e) correspond to graphs with 1,000 nodes, and (c) and (f) correspond to graphs with 10,000 nodes.
  • Figure 5: Mean $IB$ for graphs with 10,000 nodes under different context variation patterns: (a) context expansion, (b) context shrinking, and (c) context change. The blue line represents the mean of the minority community's behaviour (orange triangles) and the majority community's behaviour (green squares).
  • ...and 9 more figures
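
The calculation outlined in the Figure 1 caption (build a community co-occurrence matrix from the ground-truth labels and another from the predicted labels, then take a per-node distance between corresponding rows) can be sketched as below. This is a minimal illustration, not the authors' implementation. It assumes non-overlapping communities, a binary co-occurrence matrix whose entry (i, j) is 1 when nodes i and j share a community (diagonal included), and cosine distance as the row distance. The function names are hypothetical, and the paper's exact definition of $CC$ and $IB$ may differ, for example in normalisation.

```python
import math


def co_occurrence(labels):
    """Binary co-occurrence matrix: entry [i][j] is 1 if nodes i and j
    carry the same community label, else 0 (assumed definition)."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]


def cosine_distance(u, v):
    """1 - cosine similarity, clamped at 0 to absorb rounding error."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # no overlap can be measured against an empty row
    return max(0.0, 1.0 - dot / (norm_u * norm_v))


def individual_unfairness_scores(true_labels, pred_labels):
    """Per-node distance between the node's row in the ground-truth
    co-occurrence matrix and its row in the predicted one."""
    cc_true = co_occurrence(true_labels)
    cc_pred = co_occurrence(pred_labels)
    return [cosine_distance(t, p) for t, p in zip(cc_true, cc_pred)]


# Toy example: node 2 truly belongs with nodes 0 and 1,
# but the (hypothetical) detection method groups it with nodes 3-5.
true_labels = [0, 0, 0, 1, 1, 1]
pred_labels = [0, 0, 1, 1, 1, 1]
scores = individual_unfairness_scores(true_labels, pred_labels)
for node, score in enumerate(scores):
    print(f"node {node}: {score:.3f}")
```

In this toy case the misassigned node receives the largest score, while its former and new neighbours receive smaller non-zero scores because their co-membership sets changed too. Since the rows encode co-membership rather than label values, the scores do not depend on how communities are numbered.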