TopoFair: Linking Topological Bias to Fairness in Link Prediction Benchmarks

Lilian Marey, Mathilde Perez, Tiphaine Viard, Charlotte Laclau

TL;DR

The paper tackles fairness in graph link prediction (LP) by arguing that topological biases beyond homophily shape predictive fairness. It introduces a unified taxonomy of structural bias measures and a bias-controlled, non-learning graph generator that synthesizes graphs with varied bias profiles, enabling controlled LP benchmarking. A three-part benchmarking framework (graph generation, LP training/evaluation, and bias-fairness dependency analysis) is used to compare classical and fairness-aware LP models across synthetic use cases inspired by real datasets. Three key findings emerge: structural biases robustly influence fairness outcomes; different fairness methods exhibit use-case-specific dependencies on topology; and robustness improves only when multiple biases are jointly accounted for. Together, these findings highlight the need for structurally grounded fairness evaluations, with practical implications for deploying fair LP systems.

Abstract

Graph link prediction (LP) plays a critical role in socially impactful applications, such as job recommendation and friendship formation. Ensuring fairness in this task is thus essential. While many fairness-aware methods manipulate graph structures to mitigate prediction disparities, the topological biases inherent to social graph structures remain poorly understood and are often reduced to homophily alone. This undermines the generalization potential of fairness interventions and limits their applicability across diverse network topologies. In this work, we propose a novel benchmarking framework for fair LP, centered on the structural biases of the underlying graphs. We begin by reviewing and formalizing a broad taxonomy of topological bias measures relevant to fairness in graphs. In parallel, we introduce a flexible graph generation method that simultaneously ensures fidelity to real-world graph patterns and enables controlled variation across a wide spectrum of structural biases. We apply this framework to evaluate both classical and fairness-aware LP models across multiple use cases. Our results provide a fine-grained empirical analysis of the interactions between predictive fairness and structural biases. This new perspective reveals the sensitivity of fairness interventions to beyond-homophily biases and underscores the need for structurally grounded fairness evaluations in graph learning.
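
To make the idea of "controlled variation of a structural bias" concrete, the following is a minimal sketch and not the paper's generator. It assumes a simple two-group random graph in which same-group and cross-group pairs are linked with different probabilities, and it measures the result with Newman's attribute assortativity coefficient, which may differ from the paper's own assortativity measure. The names `generate_two_group_graph`, `attribute_assortativity`, `minority_frac`, `p_in`, and `p_out` are hypothetical and do not come from the paper.

```python
import random
from itertools import combinations


def generate_two_group_graph(n, minority_frac, p_in, p_out, seed=0):
    """Sample an undirected graph with a binary sensitive attribute.

    Assumption: a two-block random graph stands in for a bias-controlled
    generator. `minority_frac` sets class imbalance; the gap between
    `p_in` (same-group link probability) and `p_out` (cross-group link
    probability) sets homophily.
    """
    rng = random.Random(seed)
    n_minority = int(round(n * minority_frac))
    # Nodes 0..n_minority-1 are labelled sensitive (1), the rest 0.
    sensitive = [1 if i < n_minority else 0 for i in range(n)]
    edges = []
    for u, v in combinations(range(n), 2):
        p = p_in if sensitive[u] == sensitive[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return sensitive, edges


def attribute_assortativity(sensitive, edges):
    """Newman's assortativity coefficient for a binary node attribute.

    Returns a value in [-1, 1], or NaN when the coefficient is undefined
    (no edges, or every edge end falls in a single group).
    """
    # Symmetric mixing matrix over edge ends.
    e = [[0.0, 0.0], [0.0, 0.0]]
    for u, v in edges:
        a, b = sensitive[u], sensitive[v]
        e[a][b] += 1.0
        e[b][a] += 1.0
    total = sum(sum(row) for row in e)
    if total == 0:
        return float("nan")
    e = [[x / total for x in row] for row in e]
    trace = e[0][0] + e[1][1]
    # Row sums equal column sums because the matrix is symmetric.
    marginals = [sum(row) for row in e]
    sum_sq = sum(m * m for m in marginals)
    if sum_sq == 1.0:
        return float("nan")
    return (trace - sum_sq) / (1.0 - sum_sq)


if __name__ == "__main__":
    # Sweep the cross-group probability to vary homophily at fixed imbalance.
    for p_out in (0.01, 0.05, 0.10):
        s, es = generate_two_group_graph(200, 0.3, 0.10, p_out, seed=1)
        print(p_out, len(es), attribute_assortativity(s, es))
```

In this sketch, lowering `p_out` relative to `p_in` pushes the coefficient toward 1 (strong homophily), while `p_in = p_out` should give values near 0. A benchmark in the spirit of the abstract would vary several such structural properties jointly rather than homophily alone.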

Paper Structure (39 sections, 18 equations, 32 figures, 10 tables, 1 algorithm)

Figures (32)

  • Figure 1: Comparison between real (top) and generated (bottom) graphs, with their degree distributions, for the Opinion (left), Friendship (center), and Collab (right) use cases. Red and blue indicate sensitive and non-sensitive nodes, respectively. All generation parameters are fitted on real datasets.
  • Figure 2: Examples of structural bias $\rightarrow$ fairness regression feature importance scores. From left to right: Opinion ($SP$, N2V), Opinion ($SP$, SVD), Friendship ($SP$, N2V).
  • Figure 3: $\textsc{assortativity}$ (top) and $\textsc{heterogeneity}$ (bottom) across class imbalance and homophily parameters, in Opinion (left), Friendship (middle), and Collab (right) use cases.
  • Figure 4: Box plots of fair models on generated use cases. The red crosses represent the results on the real datasets: Collab, Polblogs, and Facebook.
  • Figure 5: $\textsc{assortativity}$ values of generated graphs with respect to the homophily parameter $\beta$ in the three scenarios (other graph generation parameters are fitted to real datasets).
  • ...and 27 more figures

Theorems & Definitions (1)

  • Definition 3.1: Node-level Disparity