EquiPy: Sequential Fairness using Optimal Transport in Python

Agathe Fernandes Machado, Suzie Grondin, Philipp Ratz, Arthur Charpentier, François Hu

TL;DR

EquiPy tackles demographic parity (DP) fairness when multiple sensitive variables are present. It uses optimal transport and Wasserstein barycenters to post-process predictions with minimal loss in accuracy. The method extends from a single sensitive attribute to several via sequential barycentric corrections, yielding a fair predictor $f_B$ that optimally balances the risk $\mathcal{R}(f)$ and DP fairness. The package provides model-agnostic tools (FairWasserstein, MultiWasserstein), approximate fairness options, and visualization utilities, demonstrated on US Census data to quantify and decompose unfairness by attribute. The result is a practical, interpretable framework for deploying fair predictions in real pipelines, with trade-off visuals and diagnostics to support decision-makers.
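To make the barycentric post-processing concrete, here is a minimal sketch for a single sensitive attribute. It relies on the standard closed form for univariate scores, $f_B(x, s) = \sum_{s'} p_{s'}\, Q_{s'}\big(F_s(f(x, s))\big)$, where $p_{s'}$ is the share of group $s'$, $F_s$ is the CDF of the scores in group $s$, and $Q_{s'}$ is the quantile function of group $s'$. The function name `barycenter_correction` and the empirical CDF/quantile estimators are assumptions made for illustration; this is not EquiPy's API.

```python
import numpy as np


def barycenter_correction(scores, groups):
    """Map each group's scores onto the 1D Wasserstein barycenter of the
    group-wise score distributions (hypothetical helper, not EquiPy's API)."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()  # p_s: empirical share of each group

    fair = np.empty_like(scores)
    for s in labels:
        mask = groups == s
        own = np.sort(scores[mask])
        # F_s: empirical CDF of the group's own scores, as mid-ranks in (0, 1).
        ranks = (np.searchsorted(own, scores[mask], side="right") - 0.5) / own.size
        # Barycenter quantile function: weighted average of group quantile functions.
        fair[mask] = sum(
            w * np.quantile(scores[groups == g], ranks)
            for g, w in zip(labels, weights)
        )
    return fair


# Illustrative synthetic data: group "b" receives systematically higher scores.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.0, 1.0, 1500), rng.normal(2.0, 1.0, 500)])
groups = np.array(["a"] * 1500 + ["b"] * 500)
fair_scores = barycenter_correction(scores, groups)
```

Because each group is transformed by a monotone map, the ranking of individuals within a group is preserved; only the group-wise distributions are aligned.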

Abstract

Algorithmic fairness has received considerable attention due to the failures of various predictive AI systems that have been found to be unfairly biased against subgroups of the population. Many approaches have been proposed to mitigate such biases in predictive systems. However, they often struggle to provide accurate estimates and transparent correction mechanisms when multiple sensitive variables, such as a combination of gender and race, are involved. This paper introduces a new open-source Python package, EquiPy, which provides an easy-to-use and model-agnostic toolbox for efficiently achieving fairness across multiple sensitive variables. It also offers comprehensive graphic utilities that enable the user to interpret the influence of each sensitive variable within a global context. EquiPy makes use of theoretical results that allow the complexity arising from the use of multiple variables to be broken down into easier-to-solve sub-problems. We demonstrate the ease of use for both mitigation and interpretation on publicly available data derived from the US Census and provide sample code for its use.
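The decomposition into sub-problems can be sketched as a loop that applies a one-attribute barycentric correction to one sensitive variable at a time, keeping the intermediate predictions so the contribution of each attribute can be inspected. The helpers `barycenter_correction` and `sequential_correction`, the attribute names, and the synthetic data are all hypothetical; the sketch does not use EquiPy's classes, and the exact guarantees of the sequential scheme rest on the theoretical results the paper refers to.

```python
import numpy as np


def barycenter_correction(scores, groups):
    """One-attribute correction: map each group's scores onto the 1D
    Wasserstein barycenter of the group-wise distributions (hypothetical)."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()
    fair = np.empty_like(scores)
    for s in labels:
        mask = groups == s
        own = np.sort(scores[mask])
        # Empirical CDF of the group's own scores, as mid-ranks in (0, 1).
        ranks = (np.searchsorted(own, scores[mask], side="right") - 0.5) / own.size
        # Weighted average of the group quantile functions at those ranks.
        fair[mask] = sum(
            w * np.quantile(scores[groups == g], ranks)
            for g, w in zip(labels, weights)
        )
    return fair


def sequential_correction(scores, sensitive):
    """Correct for each sensitive attribute in turn (dict insertion order).

    Returns the final scores and a dict of the scores after each step."""
    current = np.asarray(scores, dtype=float)
    steps = {}
    for name, labels in sensitive.items():
        current = barycenter_correction(current, labels)
        steps[name] = current
    return current, steps


# Synthetic example with two independent binary attributes and additive bias.
rng = np.random.default_rng(1)
n = 6000
attr_1 = rng.integers(0, 2, n)  # hypothetical first sensitive attribute
attr_2 = rng.integers(0, 2, n)  # hypothetical second sensitive attribute
raw = rng.normal(0.0, 1.0, n) + 2.0 * attr_1 + 1.0 * attr_2
final, steps = sequential_correction(raw, {"attr_1": attr_1, "attr_2": attr_2})


def mean_gap(values, labels):
    """Largest difference between group means, a crude unfairness proxy."""
    means = [values[labels == g].mean() for g in np.unique(labels)]
    return max(means) - min(means)
```

Comparing `mean_gap(raw, attr_1)` with `mean_gap(steps["attr_1"], attr_1)` and `mean_gap(final, attr_2)` shows how much of the disparity each step removes in this simple additive setting.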

Paper Structure

This paper contains 29 sections, 25 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Group-wise distribution of predictions and barycenter. Note that the $y$-axis represents density and that the barycenter contains both group-wise predictions.
  • Figure 2: Tree structure of the EquiPy package
  • Figure 3: Process of mitigating predictions using the methods of the MultiWasserstein class
  • Figure 4: Group-wise model response distribution. The fair model predictions exhibit no variations across different groups.
  • Figure 5: Group-wise model response distribution for multiple sensitive attributes: ethnicity and sex. The fair model predictions exhibit no variations across different groups.
  • ...and 2 more figures