Reranking individuals: The effect of fair classification within-groups

Sofie Goethals, Marco Favier, Toon Calders

TL;DR

This paper investigates the within-group effects of bias mitigation in fair classification, arguing that traditional between-group fairness analyses miss important intra-group reranking dynamics. It formalizes a framework with a biased score $S(x,a)$ and a fair probability $p(Y=1\vert X=x,A=a)$ and shows that, in the absence of within-group bias, fair decisions can be decomposed into group-specific thresholds, which makes threshold optimization a powerful baseline. Using five real-world datasets and a suite of bias mitigation methods from AIF360, the study demonstrates that preprocessing and inprocessing methods often substantially alter intra-group rankings, while postprocessing methods mainly adjust labels without changing the underlying score rankings. It concludes that evaluations of bias mitigation should prioritize prediction scores (AUC) and per-group performance rather than relying solely on final labels, so that they capture real-world constraints and fairness outcomes. It also discusses the affirmative-action-like decomposition as a condition under which within-group reranking is unnecessary.
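To make the idea of group-specific thresholds concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. The function names, the toy data, and the choice to equalize selection rates across groups are all assumptions made for illustration. The point it shows is that thresholding a fixed score per group changes who is accepted without reordering anyone inside a group.

```python
from collections import defaultdict


def group_thresholds(scores, groups, target_rate):
    """Per group, return the lowest score among the top `target_rate` share.

    Hypothetical helper: equal selection rates are an illustrative criterion,
    not necessarily the one optimized in the paper.
    """
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)

    thresholds = {}
    for group, values in by_group.items():
        ranked = sorted(values, reverse=True)
        k = max(1, round(target_rate * len(ranked)))  # number to accept
        thresholds[group] = ranked[k - 1]             # lowest accepted score
    return thresholds


def decide(scores, groups, thresholds):
    """Accept an instance when its score reaches its own group's threshold."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


# Toy data (assumed): group "b" receives systematically lower scores.
scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.5, 0.3, 0.1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

th = group_thresholds(scores, groups, target_rate=0.5)
labels = decide(scores, groups, th)
print(th, labels)
```

Because each decision is a monotone function of the original score within a group, no pair of individuals from the same group can swap places. Tied scores at the cut-off would all be accepted, so the realized rate can exceed the target in that case.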

Abstract

Artificial Intelligence (AI) finds widespread application across various domains, but it sparks concerns about fairness in its deployment. The prevailing discourse in classification often emphasizes outcome-based metrics comparing sensitive subgroups without a nuanced consideration of the differential impacts within subgroups. Bias mitigation techniques not only affect the ranking of pairs of instances across sensitive groups, but often also significantly affect the ranking of instances within these groups. Such changes are hard to explain and raise concerns regarding the validity of the intervention. Unfortunately, these effects remain under the radar in the accuracy-fairness evaluation framework that is usually applied. Additionally, we illustrate the effect of several popular bias mitigation methods, and how their output often does not reflect real-world scenarios.
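One way to quantify how much an intervention reorders instances within a group is a rank correlation between the scores before and after mitigation, computed separately per group. The sketch below uses Kendall's tau-a in plain Python. The metric choice, the function names, and the toy scores are assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations


def kendall_tau(before, after):
    """Kendall tau-a: (concordant - discordant) / total pairs; ties count as neither."""
    concordant = discordant = 0
    for i, j in combinations(range(len(before)), 2):
        product = (before[i] - before[j]) * (after[i] - after[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(before) * (len(before) - 1) // 2
    return (concordant - discordant) / n_pairs if n_pairs else float("nan")


def within_group_tau(before, after, groups):
    """Compute Kendall tau between old and new scores separately for each group."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        result[g] = kendall_tau([before[i] for i in idx],
                                [after[i] for i in idx])
    return result


# Toy scores (assumed): group "a" is shifted monotonically, group "b" is reshuffled.
before = [0.9, 0.7, 0.4, 0.2, 0.6, 0.5, 0.3]
after = [0.95, 0.75, 0.5, 0.3, 0.4, 0.7, 0.6]
groups = ["a", "a", "a", "a", "b", "b", "b"]

taus = within_group_tau(before, after, groups)
print(taus)
```

A value of 1 means the within-group order is fully preserved, while values near 0 or below indicate substantial reranking. The pairwise loop is quadratic in group size, so a library routine would be preferable on real datasets.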

Paper Structure

This paper contains 20 sections, 2 theorems, 25 equations, 6 figures, 3 tables.

Key Result

Theorem 3.7

The Affirmative Action assumption holds if and only if for all $a\in A$ it holds that for almost all $(x,a), (x',a)\in X\times \{a\}$

Figures (6)

  • Figure 1: Score distributions for the Compas dataset. The x-axis represents the prediction scores of the initial ML model, while the y-axis represents the prediction score after applying each bias mitigation method. The second quadrant represents the instances that are 'upgraded' by the bias mitigation method (initially predicted as negative, and after using the bias mitigation method predicted as positive), while the fourth quadrant represents the instances that are 'downgraded' by the bias mitigation method (initially predicted as positive, and after the bias mitigation method predicted as negative).
  • Figure 2: Correlation plots
  • Figure 3: Score distributions for the Adult dataset
  • Figure 4: Score distributions for the Dutch dataset
  • Figure 5: Score distributions for the Law dataset
  • ...and 1 more figure
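The quadrant reading described in the Figure 1 caption can be reproduced numerically by counting how many instances cross the decision threshold in each direction. The following is a minimal sketch. The threshold of 0.5, the function name, and the toy scores are assumptions, since the page does not state the cut-off used in the figures.

```python
def quadrant_counts(before, after, threshold=0.5):
    """Count instances whose predicted label changes after bias mitigation.

    'upgraded'   : negative before, positive after (second quadrant)
    'downgraded' : positive before, negative after (fourth quadrant)
    """
    counts = {"upgraded": 0, "downgraded": 0, "unchanged": 0}
    for b, a in zip(before, after):
        was_positive = b >= threshold
        is_positive = a >= threshold
        if not was_positive and is_positive:
            counts["upgraded"] += 1
        elif was_positive and not is_positive:
            counts["downgraded"] += 1
        else:
            counts["unchanged"] += 1
    return counts


# Toy scores (assumed) from an initial model and after a mitigation method.
before = [0.2, 0.4, 0.6, 0.8, 0.45]
after = [0.6, 0.3, 0.4, 0.9, 0.55]

counts = quadrant_counts(before, after)
print(counts)
```

Running the same count per sensitive group would show whether upgrades and downgrades are concentrated in one group or spread across both.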

Theorems & Definitions (10)

  • Example 3.1: part 1
  • Definition 3.2
  • Definition 3.3: Pareto Order
  • Example 3.4: part 2
  • Definition 3.5: Affirmative Action Assumption
  • Example 3.6: part 3
  • Theorem 3.7
  • proof
  • Theorem A.1
  • proof