A Semidefinite Relaxation Approach for Fair Graph Clustering

Sina Baharlouei, Sadra Sabouri

TL;DR

This study introduces fair graph clustering within the framework of the disparate impact doctrine, treating it as a joint optimization problem integrating clustering quality and fairness constraints, and employs a semidefinite relaxation approach to approximate the underlying optimization problem.

Abstract

Fair graph clustering is crucial for ensuring equitable representation and treatment of diverse communities in network analysis. Traditional methods often ignore disparities among social, economic, and demographic groups, perpetuating biased outcomes and reinforcing inequalities. This study introduces fair graph clustering within the framework of the disparate impact doctrine, treating it as a joint optimization problem integrating clustering quality and fairness constraints. Given the NP-hard nature of this problem, we employ a semidefinite relaxation approach to approximate the underlying optimization problem. For small to medium-sized graphs, we utilize a singular value decomposition-based algorithm, while for larger graphs, we propose a novel algorithm based on the alternating direction method of multipliers. Unlike existing methods, our formulation allows for tuning the trade-off between clustering quality and fairness. Experimental results on graphs generated from the standard stochastic block model show that our approach achieves a better accuracy-fairness trade-off than state-of-the-art methods.

Paper Structure

This paper contains 9 sections, 12 equations, 6 figures, 2 algorithms.

Figures (6)

  • Figure 1: An example of weighted, imbalanced graph clustering. Each node's specificity feature is indicated by its color (red or blue). An edge's boldness is proportional to its weight, and its length is inversely proportional to its weight. Node labels, i.e., $\{-,+\}$, show the predicted cluster. $AMI$ is the Adjusted Mutual Information vinh2009information, used to compare the predicted labels, $\hat{Y}$, with the real temporal clusters, $Y$, and with the specificity, $S$. a) The predicted cluster labels are based entirely on temporal information: the clustering method ignores fairness and predicts clusters based only on cluster locations. b) A trade-off between clustering and fairness (temporal vs. specificity features). c) Clustering is based solely on the fairness criterion.
  • Figure 2: Clustering accuracy and fairness trade-off for our model, iFairNMT GhodsiSN24, iFarSC gupta2022consistency (IndividualFairSC), GroupFairSC GhodsiSN24, and NormalSC GhodsiSN24, measured by a) adjusted mutual information vinh2009information, b) adjusted Rand index chacon2023minimum, and c) V-measure rosenberg2007v, using $1000$ samples.
  • Figure 3: Specificity and temporal AMI for $\mu=1$ as a function of $\lambda$ ($1000$ samples). The changes with respect to $\lambda$ are not symmetric: the change is smoother when $\lambda < 0$ and sharper when $\lambda > 0$. When $\lambda > 0.5$ or $\lambda < -0.6$, our method assigns all nodes to a single cluster, resulting in a drop in both specificity and temporal AMI.
  • Figure 4: Specificity AMI and temporal AMI as a function of $\lambda$ when $\mu=1$, for $1000$ samples. In this example, for $\lambda \leq -0.26$, the algorithm ignores fairness; for $\lambda \geq -0.22$, the algorithm ignores the temporal information between nodes.
  • Figure 5: Change in specificity and temporal AMI as $\lambda$ varies, for $\mu = \pm 1$ ($1000$ samples). When $\mu = -1$, the trade-off range is wider than when $\mu = 1$.
  • ...and 1 more figure
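
The captions above score predicted labels against two reference labelings: the temporal clusters $Y$ and the specificity feature $S$. As a minimal sketch of that kind of evaluation, the pure-Python function below computes the adjusted Rand index (one of the three metrics named in Figure 2) from pair counts. The toy labelings and the names `temporal`, `specificity`, and `predicted` are assumptions made for illustration; they do not come from the paper's experiments.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same nodes."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two nodes: nothing to disagree on

    # Number of node pairs placed together in both labelings, and in each one.
    joint = Counter(zip(labels_a, labels_b))
    together_both = sum(comb(c, 2) for c in joint.values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    # Chance-level agreement and the maximum attainable agreement.
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Only happens when both labelings are the same trivial partition
        # (one big cluster, or all singletons).
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Hypothetical toy data: 8 nodes, two temporal clusters, and a binary
# specificity feature that is balanced inside each temporal cluster.
temporal = [0, 0, 0, 0, 1, 1, 1, 1]
specificity = [0, 1, 0, 1, 0, 1, 0, 1]
predicted = list(temporal)  # a prediction that follows the temporal clusters

ari_temporal = adjusted_rand_index(predicted, temporal)
ari_specificity = adjusted_rand_index(predicted, specificity)
print(f"ARI vs temporal clusters: {ari_temporal:.3f}")
print(f"ARI vs specificity:       {ari_specificity:.3f}")
```

With these toy labels, the prediction scores 1.0 against the temporal clusters and about -0.167 against the specificity feature, so the two reference labelings pull the score in different directions, as in the trade-off plots.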