Comparing Methods for Bias Mitigation in Graph Neural Networks

Barbara Hoffmann, Ruben Mayer

TL;DR

This paper tackles bias in Graph Neural Network (GNN)-guided data preparation for GenAI by comparing three mitigation strategies on the German Credit dataset: data sparsification, feature modification, and synthetic data augmentation. Stratified sampling provides the most balanced fairness improvements with negligible accuracy loss. GraphSAGE-based augmentation significantly reduces demographic gaps and maintains high accuracy, albeit with a notable rise in false positive rate disparity. Feature modification yields strong fairness gains but may have limited real-world applicability due to potential pattern leakage and trade-offs in accuracy. Overall, the work offers practical guidance for deploying fair GNN-enabled data preparation pipelines that preserve task performance.

Abstract

This paper examines the critical role of Graph Neural Networks (GNNs) in data preparation for generative artificial intelligence (GenAI) systems, with a particular focus on addressing and mitigating biases. We present a comparative analysis of three distinct methods for bias mitigation: data sparsification, feature modification, and synthetic data augmentation. Through experimental analysis on the German Credit dataset, we evaluate these approaches using multiple fairness metrics, including statistical parity, equality of opportunity, and false positive rates. Our research demonstrates that while all methods improve fairness metrics compared to the original dataset, stratified sampling and synthetic data augmentation using GraphSAGE prove particularly effective in balancing demographic representation while maintaining model performance. The results provide practical insights for developing more equitable AI systems.
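The three fairness metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels, binary predictions, and a binary protected attribute, and it reports each metric as an absolute gap between the two groups. The function name `fairness_gaps` and the toy data are hypothetical and are not taken from the paper, whose exact metric definitions may differ.

```python
from typing import Dict, Sequence


def _rate(flags: Sequence[int]) -> float:
    """Fraction of ones in a 0/1 sequence; NaN when the sequence is empty."""
    return sum(flags) / len(flags) if flags else float("nan")


def fairness_gaps(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> Dict[str, float]:
    """Absolute between-group gaps for a binary protected attribute (0 or 1).

    Hypothetical helper for illustration only, not the paper's implementation.
    """
    stats = {}
    for g in (0, 1):
        idx = [i for i, v in enumerate(group) if v == g]
        # Statistical parity compares the positive prediction rate per group.
        pos_rate = _rate([y_pred[i] for i in idx])
        # Equality of opportunity compares the true positive rate per group.
        tpr = _rate([y_pred[i] for i in idx if y_true[i] == 1])
        # The false positive rate is measured on true negatives only.
        fpr = _rate([y_pred[i] for i in idx if y_true[i] == 0])
        stats[g] = (pos_rate, tpr, fpr)
    return {
        "statistical_parity_gap": abs(stats[0][0] - stats[1][0]),
        "equal_opportunity_gap": abs(stats[0][1] - stats[1][1]),
        "fpr_gap": abs(stats[0][2] - stats[1][2]),
    }


# Toy data (made up for illustration): four individuals in each group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

gaps = fairness_gaps(y_true, y_pred, group)
print(gaps)
```

A gap of zero on a metric means both groups are treated identically under that criterion. The criteria can move independently, which is consistent with the reported case where augmentation narrows demographic gaps while false positive rate disparity rises.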

Paper Structure

This paper contains 14 sections, 3 equations, 1 figure, and 1 table.

Figures (1)

  • Figure 1: Fairness metric results for each modified dataset. Each method's distribution is repeated three times to account for random variation. Black bars indicate the standard deviation range, while red delta values highlight group differences. For fairness metrics, deltas are critical, whereas for accuracy, the focus is on absolute values.