Incorporating Fairness in Neighborhood Graphs for Fair Spectral Clustering

Adithya K Moorthy, V Vijaya Saradhi, Bhanu Prasad

TL;DR

This work tackles bias in spectral clustering by integrating fairness into the graph construction phase. It introduces fair $k$-NN and fair $\varepsilon$-neighborhood graphs that enforce demographic parity at the neighborhood level as a preprocessing step, preserving geometric structure while improving representation of sensitive groups. Across synthetic and real-world datasets, the proposed methods achieve higher Balance than the baselines with competitive clustering quality; fairness increases with the disparate-impact parameter $\alpha$, and the results are robust to parameter choices. The approach offers a practical, efficient path to fair unsupervised clustering that does not require changes to the clustering algorithm itself.
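One way to see why no change to the clustering algorithm is needed: spectral clustering sees the data only through the graph's adjacency matrix, so a fairness-aware graph can be passed in exactly like an ordinary one. The sketch below is a minimal two-cluster illustration that uses the unnormalized Laplacian and the sign of its second eigenvector. The function name `spectral_bipartition` and the choice of Laplacian are assumptions made for illustration, not the paper's pipeline.

```python
import numpy as np

def spectral_bipartition(A):
    """Split a connected, undirected graph into two clusters.

    A is a symmetric adjacency matrix. Any graph construction
    (ordinary or fairness-aware) can supply it; this step is unchanged.
    """
    A = np.asarray(A, dtype=float)
    # Unnormalized graph Laplacian: L = D - A.
    L = np.diag(A.sum(axis=1)) - A
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L)
    # The eigenvector of the second-smallest eigenvalue (Fiedler vector)
    # encodes the two-way split in the signs of its entries.
    fiedler = eigvecs[:, 1]
    return (fiedler > 0).astype(int)

# Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
labels = spectral_bipartition(A)
```

Here `labels` places each triangle in its own cluster. Which triangle receives label 0 depends on the sign of the eigenvector returned by the solver, so only the grouping is meaningful.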

Abstract

Graph clustering plays a pivotal role in unsupervised learning methods like spectral clustering, yet traditional methods for graph clustering often perpetuate bias through unfair graph constructions that may underrepresent some groups. This work introduces novel approaches for constructing fair k-nearest neighbor (kNN) and fair epsilon-neighborhood graphs that proactively enforce demographic parity during graph formation. By imposing fairness constraints at the earliest stage, neighborhood selection, our approaches build proportional representation of sensitive features into the local graph structure while maintaining geometric consistency. Our work addresses a critical gap in pre-processing for fair spectral clustering, demonstrating that topological fairness in graph construction is essential for achieving equitable clustering outcomes. Widely used graph construction methods like kNN and epsilon-neighborhood graphs propagate edge-based disparate impact on sensitive groups, leading to biased clustering results. Providing representation of each sensitive group in the neighborhood of every node leads to fairer spectral clustering results because the topological features of the graph naturally reflect equitable group ratios. This research thus shows how topological fairness in graph construction inherently facilitates fairer spectral clustering results without the need for changes to the clustering algorithm itself. Thorough experiments on three synthetic datasets, seven real-world tabular datasets, and three real-world image datasets show that our fair graph construction methods surpass the current baselines in graph clustering tasks.
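As a rough illustration of enforcing proportional representation during neighbor selection, the following sketch gives each sensitive group a share of the k neighbor slots proportional to its share of the dataset, then fills each share with that group's nearest points. The quota rule, the largest-remainder rounding, and the name `fair_knn_adjacency` are assumptions made for this example; the paper's own algorithm (including its $\alpha$ parameter) may differ.

```python
import numpy as np

def fair_knn_adjacency(X, groups, k):
    """Directed kNN adjacency with per-group neighbor quotas (hypothetical sketch).

    Row i marks the neighbors chosen for point i. A group with too few
    members simply contributes fewer neighbors than its quota.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    n = X.shape[0]
    labels, counts = np.unique(groups, return_counts=True)

    # Assumed quota rule: slots proportional to group share,
    # rounded by the largest-remainder method so quotas sum to k.
    exact = k * counts / n
    quota = np.floor(exact).astype(int)
    leftover = k - quota.sum()
    order = np.argsort(-(exact - quota), kind="stable")
    quota[order[:leftover]] += 1

    # Pairwise Euclidean distances (fine for small n; O(n^2) memory).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for g, q in zip(labels, quota):
            members = np.flatnonzero(groups == g)
            members = members[members != i]  # a point is not its own neighbor
            nearest = members[np.argsort(dist[i, members], kind="stable")[:q]]
            A[i, nearest] = 1
    return A

# Two well-separated groups on a line; plain kNN would never cross groups.
X = [[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0], [13.0]]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
A = fair_knn_adjacency(X, groups, k=2)
```

With equal group sizes and `k=2`, every row of `A` contains one neighbor from each group. An undirected graph for spectral clustering can be obtained with `np.maximum(A, A.T)`.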

Paper Structure

This paper contains 19 sections, 14 equations, 5 figures, 4 tables, 3 algorithms.

Figures (5)

  • Figure 1: The left side of (a) shows a graph constructed with unfair neighborhoods, resulting in a biased clustering outcome; a portion of the unfair neighborhood is shown on the right side. The left side of (b) shows a graph constructed with fair neighborhoods, leading to a more balanced clustering outcome; a portion of that fair neighborhood is shown on the right side.
  • Figure 2: From the left, assuming $k=4$, the first figure shows the initial kNN graph constructed with unfair neighborhoods, leading to a biased clustering outcome. The second figure illustrates the process of adjusting neighborhoods to ensure fairness, where nodes from the same sensitive group are replaced with nodes from a different sensitive group. The third figure shows the final fair kNN graph, which leads to a more balanced neighborhood.
  • Figure 3: The process of constructing fair $\epsilon$-neighborhood graphs. The left side shows the unfair neighborhood graph and how it is adjusted to ensure fairness. The right side shows the fair neighborhood graph after adjustments. The nodes are colored based on their sensitive group membership, and the dotted lines represent edges in the neighborhood graph.
  • Figure 4: Change in Balance with the Disparate Impact parameter $\alpha$ on the synthetic dataset.
  • Figure 5: Panels (a) to (h) show the change in Balance with $k$ for fair_kNN graph construction, and panels (i) to (p) show the change in Balance with $\epsilon$ for Fair_$\epsilon$-neighborhood graph construction, on the real-world datasets. In every panel the y-axis is the Balance of the clusters; the x-axis is $k$ in (a) to (h) and $\epsilon$ in (i) to (p). Within each group of panels, the datasets appear in the order Adult, Bank, Catalonia, COMPAS, Credit, Crime, Law, and Student. The pink line marks the maximum achievable Balance for each dataset. The blue line shows the Balance obtained when the graph is constructed with fairness constraints.
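The Balance metric plotted in Figures 4 and 5 is not defined in this section. The sketch below assumes the definition commonly used in the fair clustering literature: for each cluster, take the ratio of the smallest to the largest sensitive-group count, then report the minimum over clusters. The function name `balance` is hypothetical, and the paper's exact formula may differ.

```python
import numpy as np

def balance(cluster_labels, groups):
    """Minimum over clusters of (smallest group count / largest group count).

    Assumed definition: 0.0 means some cluster lacks a group entirely,
    1.0 means every cluster contains all groups in equal numbers.
    """
    cluster_labels = np.asarray(cluster_labels)
    groups = np.asarray(groups)
    all_groups = np.unique(groups)
    worst = 1.0
    for c in np.unique(cluster_labels):
        in_cluster = groups[cluster_labels == c]
        counts = np.array([(in_cluster == g).sum() for g in all_groups])
        if counts.min() == 0:
            return 0.0  # a group is missing from this cluster
        worst = min(worst, counts.min() / counts.max())
    return float(worst)

groups = [0, 0, 1, 0, 1, 1]
mixed = balance([0, 0, 0, 1, 1, 1], groups)       # each cluster is 2:1
single = balance([0, 0, 0, 0, 0, 0], groups)      # one cluster, 3:3
segregated = balance([0, 0, 1, 1], [0, 0, 1, 1])  # clusters match groups
```

Under this assumed definition, placing every point in a single cluster yields the dataset-wide group ratio, and no clustering can exceed that value, which would correspond to a ceiling like the pink line in Figure 5.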

Theorems & Definitions (4)

  • Definition 1: Core Point
  • Definition 2: Direct Density Reachability
  • Definition 3: Density Reachability
  • Definition 4: Fair Neighborhood