FairAD: Computationally Efficient Fair Graph Clustering via Algebraic Distance

Minh Phu Vuong, Young-Ju Lee, Iván Ojeda-Ruiz, Chul-Ho Lee

TL;DR

FairAD tackles fair graph clustering by embedding fairness constraints into the affinity construction using algebraic distance, followed by graph coarsening to identify representative anchors and a constrained minimization to assign all nodes. The method leverages a constrained Jacobi approach with Uzawa updates and the Woodbury identity for efficiency, plus GPU-accelerated computation and advanced multigrid solvers for large systems. Empirical results on synthetic and real networks show FairAD achieves strong fairness (balanced clusters) while delivering up to $40\times$ faster runtimes than existing fair clustering methods. This yields a scalable, practical solution for fair clustering in large graphs with demographic constraints.
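To make the "Uzawa updates" and "Woodbury identity" ingredients concrete, here is a minimal sketch of an augmented Lagrangian Uzawa iteration on a toy problem. It is not the paper's algorithm. It assumes an equality-constrained quadratic, $\min_x \tfrac12 x^\top \mathbf{D} x - b^\top x$ subject to $\mathbf{F}^\top x = c$, with a diagonal positive $\mathbf{D}$. The function names `woodbury_solve` and `uzawa`, the penalty `rho`, and the right-hand sides `b` and `c` are hypothetical choices made for illustration.

```python
import numpy as np

def woodbury_solve(d, F, rho, rhs):
    """Solve (diag(d) + rho * F @ F.T) x = rhs via the Woodbury identity.

    Only an m x m system is factorized, where m = F.shape[1] is the number
    of constraints, instead of the full n x n system.
    """
    m = F.shape[1]
    Dinv_rhs = rhs / d                       # D^{-1} rhs
    Dinv_F = F / d[:, None]                  # D^{-1} F
    small = np.eye(m) / rho + F.T @ Dinv_F   # I/rho + F^T D^{-1} F
    return Dinv_rhs - Dinv_F @ np.linalg.solve(small, F.T @ Dinv_rhs)

def uzawa(d, F, b, c, rho=10.0, iters=200, tol=1e-10):
    """Augmented Lagrangian Uzawa iteration for the toy problem above.

    Assumption (illustrative only): the quadratic term is exactly diag(d),
    so each primal step is an exact solve rather than an approximate one.
    """
    lam = np.zeros(F.shape[1])
    x = np.zeros_like(b)
    for _ in range(iters):
        # Primal step: minimize the augmented Lagrangian in x for fixed lam.
        x = woodbury_solve(d, F, rho, b - F @ lam + rho * (F @ c))
        # Dual step: move the multiplier along the constraint residual.
        residual = F.T @ x - c
        lam = lam + rho * residual
        if np.linalg.norm(residual) < tol:
            break
    return x, lam
```

In this sketch the small matrix inside `woodbury_solve` is built from $\mathbf{F}^\top\mathbf{D}^{-1}\mathbf{F}$, so the cost of each primal step scales with the number of constraints rather than with the number of nodes.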

Abstract

Due to the growing concern about unsavory behaviors of machine learning models toward certain demographic groups, the notion of 'fairness' has recently drawn much attention from the community, thereby motivating the study of fairness in graph clustering. Fair graph clustering aims to partition the set of nodes in a graph into $k$ disjoint clusters such that the proportion of each protected group within each cluster is consistent with the proportion of that group in the entire dataset. It is, however, computationally challenging to incorporate fairness constraints into existing graph clustering algorithms, particularly for large graphs. To address this problem, we propose FairAD, a computationally efficient fair graph clustering method. It first constructs a new affinity matrix based on the notion of algebraic distance such that fairness constraints are imposed. A graph coarsening process is then performed on this affinity matrix to find representative nodes that correspond to $k$ clusters. Finally, a constrained minimization problem is solved to obtain the fair clustering solution. Experimental results on the modified stochastic block model and six public datasets show that FairAD can achieve fair clustering while being up to 40 times faster than state-of-the-art fair graph clustering algorithms.
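The fairness notion in the abstract, that each group's share inside every cluster should match its share in the whole dataset, can be checked numerically. The sketch below is one simple way to do so and is not the paper's evaluation metric. The function names `group_proportions` and `max_proportion_gap`, and the integer encodings of cluster labels (`0..k-1`) and group memberships (`0..h-1`), are assumptions made for illustration.

```python
import numpy as np

def group_proportions(labels, groups, k, h):
    """Return a (k, h) matrix whose (c, s) entry is the share of group s in cluster c."""
    labels = np.asarray(labels, dtype=int)
    groups = np.asarray(groups, dtype=int)
    counts = np.zeros((k, h))
    # Accumulate one count per node at position (its cluster, its group).
    np.add.at(counts, (labels, groups), 1)
    sizes = counts.sum(axis=1, keepdims=True)
    # Empty clusters keep all-zero rows instead of triggering a division by zero.
    return np.divide(counts, sizes, out=np.zeros_like(counts), where=sizes > 0)

def max_proportion_gap(labels, groups, k, h):
    """Largest absolute gap between any in-cluster group share and the overall share."""
    groups = np.asarray(groups, dtype=int)
    overall = np.bincount(groups, minlength=h) / len(groups)
    return np.abs(group_proportions(labels, groups, k, h) - overall).max()

# Two clusters, two groups: every cluster mirrors the overall 50/50 split.
fair_gap = max_proportion_gap([0, 0, 1, 1], [0, 1, 0, 1], k=2, h=2)
# Each cluster contains a single group, the least fair arrangement here.
unfair_gap = max_proportion_gap([0, 0, 1, 1], [0, 0, 1, 1], k=2, h=2)
```

A gap of zero means every cluster reproduces the dataset-wide group proportions exactly. Larger values indicate that at least one cluster over- or under-represents some group.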

Paper Structure

This paper contains 12 sections, 1 theorem, 39 equations, 6 figures, 3 tables, 3 algorithms.

Key Result

Lemma 1

Let $(x^0,\lambda^0)$ be a given initial guess, and for $\ell\ge1$ let $(x^{\ell},\lambda^{\ell})$ be the iterates produced by the augmented Lagrangian Uzawa method. Denote by $\gamma_0$ the smallest eigenvalue of $\mathbf{F}^\top\mathbf{D}^{-1}\mathbf{F}$. Then the following holds:

Figures (6)

  • Figure 1: An overview of FairAD.
  • Figure 2: Running time of FairAD with and without CuPy.
  • Figure 3: Running time of FairAD with and without PyAMG.
  • Figure 4: Error rate (top row) and running time in seconds (bottom row) on synthetic networks generated by mSBM with varying values of $h$ and $k$.
  • Figure 5: Average balance for NBA, German, and LastFM datasets (top row) and for Recidivism, Deezer, and Credit datasets (bottom row), when changing the number of clusters.
  • ...and 1 more figure

Theorems & Definitions (1)

  • Lemma 1