Overcoming Representation Bias in Fairness-Aware data Repair using Optimal Transport

Abigail Langbridge; Anthony Quinn; Robert Shorten

TL;DR

The paper formulates a novel definition of the fair distributional target, together with quantifiers that trade fairness against damage in the transformed data, and uses them to reveal excellent performance of a representation-bias-tolerant repair scheme on simulated and benchmark data sets.

Abstract

Optimal transport (OT) plays an important role in transforming data distributions in a way that promotes fairness. Typically, the OT operators are learnt from unfair, attribute-labelled data and then used to repair those same data. This approach has two significant limitations: (i) the OT operators for underrepresented subgroups are poorly learnt (i.e. they are susceptible to representation bias); and (ii) the OT repairs cannot be applied to identically distributed but out-of-sample (i.e. archival) data. In this paper, we address both problems by adopting a Bayesian nonparametric stopping rule for learning each attribute-labelled component of the data distribution. The induced OT-optimal quantization operators can then be used to repair the archival data. We formulate a novel definition of the fair distributional target, along with quantifiers that allow us to trade fairness against damage in the transformed data. These are used to reveal excellent performance of our representation-bias-tolerant scheme on simulated and benchmark data sets.
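
As a rough illustration of learning a repair from labelled data and then applying it to out-of-sample points, the sketch below uses standard one-dimensional quantile matching toward an equal-weight Wasserstein barycenter. This is a swapped-in textbook technique, not the Bayesian nonparametric stopping rule or the quantization operators described in the abstract; the function names `fit_quantile_repair` and `repair`, the `n_quantiles` parameter, and the equal-weight target are all assumptions made for the example.

```python
import numpy as np


def fit_quantile_repair(x_by_group, n_quantiles=101):
    """Learn per-group 1D OT maps toward an equal-weight barycenter.

    x_by_group: dict mapping a group label to a 1D array of feature values.
    Returns (group_q, target_q): each group's empirical quantiles and the
    target quantiles, all evaluated on one shared grid of quantile levels.
    """
    levels = np.linspace(0.0, 1.0, n_quantiles)
    group_q = {
        s: np.quantile(np.asarray(x, dtype=float), levels)
        for s, x in x_by_group.items()
    }
    # In 1D, the Wasserstein barycenter's quantile function is the
    # (here equal-weight) average of the groups' quantile functions.
    target_q = np.mean(list(group_q.values()), axis=0)
    return group_q, target_q


def repair(x, s, group_q, target_q):
    """Map points x from group s onto the target distribution.

    Works for out-of-sample points: each value is placed on the group's
    learnt quantile grid and read off the target grid by piecewise-linear
    interpolation. Values outside the learnt range are clamped to the
    target's end points.
    """
    return np.interp(np.asarray(x, dtype=float), group_q[s], target_q)
```

Because each group's map is estimated only from that group's samples, a small subgroup yields noisier quantiles and therefore a noisier map, which is the representation-bias sensitivity the abstract refers to.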

Paper Structure (19 sections, 20 equations, 5 figures, 5 tables)

Introduction
AI Unfairness and Representation Bias
Societal and AI Unfairness
Representation Bias
Bayesian Learning of the Sub-Group Models, $\mathsf{F}_{u,s}$
Data-Driven Fairness Correction
Learning the Repair Operation
Evaluating (Un)Fairness
Data Damage
Experiments
Learning non-Gaussian Models
Multinomial Models
Gaussian Mixture Models
Fairness Repair Under Representation Bias
Benchmarking on Simulated and Real Data
...and 4 more sections

Figures (5)

Figure 1: Comparison of the causal graphs of different sources of unfairness under our conditional independence model.
Figure 2: LKLD until stopping for observations from categorical variables with $q$ states.
Figure 3:
Figure 4: Stopping numbers, $\hat{n}$, sample mean, $\mathbb{E}(X)$, and sample variance, $S^2(X)$, for GMM data over 500 Monte-Carlo simulations.
Figure 5: $\log \hat{E}$ and damage (per the paper's KLD-based damage definition) for data with simulated representation bias $\Pr[U = 0] \in (0, 0.5]$. Results are reported $\pm$ standard deviation over 20 Monte-Carlo simulations.

Theorems & Definitions (2)

Definition 2.1: Representation bias
Definition 4.1
