When mitigating bias is unfair: multiplicity and arbitrariness in algorithmic group fairness

Natasa Krco, Thibault Laugel, Vincent Grari, Jean-Michel Loubes, Marcin Detyniecki

TL;DR

This paper addresses how bias mitigation in machine learning can be arbitrary and multiplicity-driven, even when global fairness and accuracy metrics are similar. It introduces the FRAME framework, a five-dimension evaluation tool that analyzes impact size, change direction, decision rates, affected subpopulations, and neglected subpopulations to reveal nuanced effects of debiasing methods. By applying FRAME to five tabular datasets and a range of pre-processing, in-processing, and post-processing debiasing strategies, the study demonstrates substantial differences in which individuals are affected and how global metrics may mask local unfairness. The findings argue for more transparent, multi-dimensional evaluation of debiasing processes and propose directions toward designing fairer, less arbitrary models with better consideration of individual and subpopulation impacts.

Abstract

Most research on fair machine learning has prioritized optimizing criteria such as Demographic Parity and Equalized Odds. Despite these efforts, there remains a limited understanding of how different bias mitigation strategies affect individual predictions and whether they introduce arbitrariness into the debiasing process. This paper addresses these gaps by exploring whether models that achieve comparable fairness and accuracy metrics impact the same individuals and mitigate bias in a consistent manner. We introduce the FRAME (FaiRness Arbitrariness and Multiplicity Evaluation) framework, which evaluates bias mitigation through five dimensions: Impact Size (how many people were affected), Change Direction (positive versus negative changes), Decision Rates (impact on models' acceptance rates), Affected Subpopulations (who was affected), and Neglected Subpopulations (where unfairness persists). This framework is intended to help practitioners understand the impacts of debiasing processes and make better-informed decisions regarding model selection. Applying FRAME to various bias mitigation approaches across key datasets allows us to exhibit significant differences in the behaviors of debiasing methods. These findings highlight the limitations of current fairness criteria and the inherent arbitrariness in the debiasing process.
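As a rough illustration of the first three dimensions (Impact Size, Change Direction, Decision Rates), the sketch below compares the binary predictions of a biased model with those of a debiased one. The helper name `frame_summary`, the toy labels, and the group names are hypothetical and are not taken from the paper; the paper's exact definitions may differ.

```python
def frame_summary(y_biased, y_fair, group):
    """Hypothetical helper (not the authors' code): summarize how a
    debiased model's binary predictions differ from a biased model's."""
    n = len(y_biased)
    # Indices of individuals whose prediction changed after debiasing.
    changed = [i for i in range(n) if y_biased[i] != y_fair[i]]

    # D1, Impact Size: fraction of individuals whose prediction changed.
    impact_size = len(changed) / n

    # D2, Change Direction: 0 -> 1 counts as positive, 1 -> 0 as negative.
    positive = sum(1 for i in changed if y_fair[i] > y_biased[i])
    negative = len(changed) - positive

    # D3, Decision Rates: acceptance rate of the fair model per group.
    rates = {}
    for g in sorted(set(group)):
        idx = [i for i, s in enumerate(group) if s == g]
        rates[g] = sum(y_fair[i] for i in idx) / len(idx)

    return {
        "impact_size": impact_size,
        "positive": positive,
        "negative": negative,
        "acceptance_rates": rates,
    }


# Toy data, invented for illustration only.
y_biased = [1, 1, 0, 0, 1, 0, 0, 0]
y_fair = [1, 0, 0, 1, 1, 0, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

summary = frame_summary(y_biased, y_fair, group)
print(summary)
```

In this toy case both groups end with the same acceptance rate, yet reaching it required flipping three of eight predictions, one of them in the negative direction; a global fairness score alone would not show this.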

Paper Structure (42 sections, 6 equations, 13 figures, 10 tables)

Figures (13)

  • Figure 1: Illustration of multiplicity in the debiasing process: replacing an existing (biased) model with a fair one is an underspecified problem. Several models (here, two are shown) achieve the same accuracy and fairness scores by adopting drastically different strategies.
  • Figure 2: How similar are the predictions of fair models with similar performance? Intersection-over-union (IoU) values for the sets of instances targeted by the considered debiasing algorithms, for Demographic Parity (left) and Equalized Odds (right).
  • Figure 3: Impact size (D1): Mean and standard deviation of the number of instances impacted, over 10 runs of each method, for each dataset. Left: Demographic Parity methods. Right: Equalized Odds methods. Results are in $\%$ of the test set.
  • Figure 4: Change direction (D2): distribution of the observed differences along the sensitive groups and the change direction, for the Adult (left) and Dutch (right) datasets. The bar heights correspond to the proportion, out of the individuals affected by a change, of each sensitive group that got affected by a positive (full rectangles) or negative (hatched rectangles) difference.
  • Figure 5: Final decision rates on Demographic Parity (D3): $\mathbb{E}(\hat{Y})$ values achieved by bias mitigation models for each sensitive group of the Adult (left) and Credit (right) datasets.
  • ...and 8 more figures
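
The comparison behind Figure 2 can be sketched as an intersection-over-union between the sets of instances whose predictions each debiasing method changes. The function names, the toy predictions, and the convention of returning 1.0 when both sets are empty are assumptions made for this illustration, not details confirmed by the paper.

```python
def affected_set(y_before, y_after):
    """Indices whose prediction differs between the two models."""
    return {i for i, (b, a) in enumerate(zip(y_before, y_after)) if b != a}


def iou(set_a, set_b):
    """Intersection over union of two index sets."""
    union = set_a | set_b
    if not union:
        # Assumed convention: two methods that change nothing agree fully.
        return 1.0
    return len(set_a & set_b) / len(union)


# Toy predictions, invented for illustration only.
y_biased = [1, 1, 0, 0, 1, 0]
y_fair_1 = [1, 0, 0, 1, 1, 0]  # changes instances 1 and 3
y_fair_2 = [1, 0, 1, 0, 1, 0]  # changes instances 1 and 2

targets_1 = affected_set(y_biased, y_fair_1)
targets_2 = affected_set(y_biased, y_fair_2)
overlap = iou(targets_1, targets_2)
print(targets_1, targets_2, overlap)
```

Both toy models change the same number of predictions, but they share only one of the three affected instances, so the IoU is one third. A low IoU between models with similar global scores is the kind of multiplicity the figure is meant to expose.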