10 Years of Fair Representations: Challenges and Opportunities

Mattia Cerrato; Marius Köppel; Philipp Wolf; Stefan Kramer

TL;DR

This paper revisits a decade of Fair Representation Learning (FRL). It formalizes the goal of compressing $X$ into a representation $Z$ that minimizes $I(Z; S)$ while preserving $I(Z; Y)$, and frames the fundamental trade-off in two ways: the weighted objective $\min_{\theta} (1-\gamma) \mathcal{L}_{class}(\theta) + \gamma \mathcal{L}_{fair}(\theta)$ and the mutual-information formulation $\min_{Z} I(Z; S)$ subject to $I(Z; Y) \ge \alpha$. It introduces a theoretical impossibility result for deterministic, infinite-precision FRL with injective activations, showing that $I(S; Z^i)=I(S; X)$ at every layer, and pairs it with a large-scale empirical evaluation using EvalFRL on six datasets and eight FRL methods, with AutoML used to probe residual dependence on $S$. The results indicate that many deterministic FRL methods fail to remove sensitive information from the learned representations, while stochastic or quantized variants can achieve stronger invariance, in line with the proposed impossibility result. The work argues for rigorous, dual-frame evaluation, transparent reporting, and a shift toward stochastic/quantized approaches and physics-informed datasets to realize FRL's real-world fairness potential.
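As a rough illustration of the weighted objective $(1-\gamma) \mathcal{L}_{class} + \gamma \mathcal{L}_{fair}$, the sketch below evaluates it for a fixed set of predictions. The summary does not say which losses the paper uses, so binary cross-entropy for $\mathcal{L}_{class}$ and a demographic-parity gap for $\mathcal{L}_{fair}$ are assumptions chosen for simplicity, and all function names are hypothetical.

```python
import math

def binary_cross_entropy(probs, labels):
    """Mean cross-entropy between predicted probabilities and 0/1 labels."""
    eps = 1e-12  # guards against log(0)
    return -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(probs, labels)
    ) / len(labels)

def demographic_parity_gap(probs, sensitive):
    """Illustrative fairness penalty (an assumption, not the paper's loss):
    absolute difference in mean predicted score between the two groups."""
    g0 = [p for p, s in zip(probs, sensitive) if s == 0]
    g1 = [p for p, s in zip(probs, sensitive) if s == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))

def frl_objective(probs, labels, sensitive, gamma):
    """(1 - gamma) * L_class + gamma * L_fair for a fixed set of predictions."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    l_class = binary_cross_entropy(probs, labels)
    l_fair = demographic_parity_gap(probs, sensitive)
    return (1 - gamma) * l_class + gamma * l_fair

# Toy predictions that are accurate but strongly correlated with the group.
probs = [0.9, 0.8, 0.3, 0.2]
labels = [1, 1, 0, 0]
sensitive = [0, 0, 1, 1]
for gamma in (0.0, 0.5, 1.0):
    print(gamma, round(frl_objective(probs, labels, sensitive, gamma), 4))
```

At $\gamma = 0$ only the classification term counts, and at $\gamma = 1$ only the fairness term does. Sweeping $\gamma$ between the two is what traces out the accuracy/fairness trade-off curves.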

Abstract

Fair Representation Learning (FRL) is a broad set of techniques, mostly based on neural networks, that seeks to learn new representations of data from which sensitive or undesired information has been removed. Methodologically, FRL was pioneered by Richard Zemel et al. about ten years ago. The basic concepts, objectives, and evaluation strategies for FRL methodologies remain unchanged to this day. In this paper, we look back at the first ten years of FRL by i) revisiting its theoretical standing in light of recent work in deep learning theory that shows the hardness of removing information from neural network representations and ii) presenting the results of a large-scale experimental study (225,000 model fits and 110,000 AutoML fits) that we conducted with the objective of improving on the common evaluation scenario for FRL. More specifically, we use automated machine learning (AutoML) to adversarially "mine" sensitive information from supposedly fair representations. Our theoretical and experimental analysis suggests that deterministic, unquantized FRL methodologies have serious difficulty removing sensitive information, which is especially troubling because they might seem "fair" at first glance.
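The idea of adversarially "mining" sensitive information can be sketched with a much simpler probe than AutoML: train a classifier to predict $S$ from the representation $Z$ and report its AUC on fresh data. The sketch below is a minimal stand-in under stated assumptions. It uses a hand-written logistic-regression probe and two synthetic, hypothetical representations, not EvalFRL or the paper's AutoML setup. A weak probe like this one can only establish that leakage exists; a near-chance AUC does not prove invariance, which is the motivation for stronger probes.

```python
import math
import random

def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def fit_logistic_probe(Z, s, epochs=200, lr=0.1):
    """Fit a logistic-regression probe s ~ Z with batch gradient descent."""
    dim, n = len(Z[0]), len(Z)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for z, target in zip(Z, s):
            err = sigmoid(sum(wj * zj for wj, zj in zip(w, z)) + b) - target
            for j in range(dim):
                grad_w[j] += err * z[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def auc(scores, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [sc for sc, y in zip(scores, labels) if y == 1]
    neg = [sc for sc, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def probe_auc(make_z, n=400, seed=0):
    """Train on one sample, report AUC for predicting s on a fresh sample."""
    rng = random.Random(seed)
    def sample():
        s = [i % 2 for i in range(n)]
        return [make_z(si, rng) for si in s], s
    Z_train, s_train = sample()
    Z_test, s_test = sample()
    w, b = fit_logistic_probe(Z_train, s_train)
    scores = [sum(wj * zj for wj, zj in zip(w, z)) + b for z in Z_test]
    return auc(scores, s_test)

# Hypothetical representations: one leaks s through its first coordinate,
# the other is pure noise and therefore independent of s.
leaky = lambda s, rng: [3.0 * s + rng.gauss(0, 1), rng.gauss(0, 1)]
invariant = lambda s, rng: [rng.gauss(0, 1), rng.gauss(0, 1)]

leaky_auc = probe_auc(leaky)
invariant_auc = probe_auc(invariant)
print(f"probe AUC on leaky representation:     {leaky_auc:.3f}")
print(f"probe AUC on invariant representation: {invariant_auc:.3f}")
```

An AUC well above 0.5 means the probe recovers $S$ from $Z$, so the representation is not invariant. An AUC near 0.5 only means that this particular probe found nothing.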

Paper Structure (19 sections, 1 theorem, 9 equations, 25 figures, 1 algorithm)

Introduction
Related Work
Challenges in Fair Representation Learning
Experiments
EvalFRL: An Evaluation Library for Fair Representation Learning
Fairness Metrics
Results in Fair Allocation
Results in Invariant Representations
Fair Representation Learning: The Next 10 Years
Detailed Experimental Setup
Models
Dataset Information
Other Fairness Metrics
Other Results in Fair Allocation
ReLU Activation Tests
...and 4 more sections

Key Result

Theorem 1

Let $X$, $Y$, and $S$ be the random variables representing data, labels, and sensitive attributes, respectively. Let $\phi^i(x) = \sigma(A^i \phi^{i-1}(x) + b^i)$ be the $i$-th layer function of a DNN, where $A^i$ is a weight matrix, $b^i$ a bias vector, and $\sigma$ an injective non-linearity. Let $Z^i =
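The intuition behind the claim that $I(S; Z^i) = I(S; X)$ for deterministic layers with injective activations can be checked on a toy discrete example: an injective map only relabels the values of $X$, so it cannot change the mutual information with $S$, whereas quantization merges values and can reduce it. The sketch below is an illustration under simplifying assumptions, not the paper's proof. It uses a scalar layer with a nonzero weight (so the affine part is also injective), $\sigma = \tanh$, a plug-in estimate on a small hand-made dataset, and hypothetical function names.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(A; B) in bits from a list of (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    pa = Counter(a for a, _ in pairs)
    pb = Counter(b for _, b in pairs)
    return sum(
        (c / n) * math.log2(c * n / (pa[a] * pb[b]))
        for (a, b), c in joint.items()
    )

def layer(x, weight=1.0, bias=0.0):
    """Scalar analogue of sigma(A x + b) with the injective sigma = tanh."""
    return math.tanh(weight * x + bias)

def quantize(z, step=1.0):
    """Snap z to a coarse grid; distinct inputs may collide."""
    return round(z / step) * step

# Toy (s, x) samples: group 0 tends to have low x, group 1 high x.
data = (
    [(0, -1.0)] * 4 + [(0, -0.25)] * 3 + [(0, 0.25)] * 1
    + [(1, -0.25)] * 1 + [(1, 0.25)] * 3 + [(1, 1.0)] * 4
)

mi_x = mutual_information(data)
mi_z = mutual_information([(s, layer(x)) for s, x in data])
mi_q = mutual_information([(s, quantize(layer(x))) for s, x in data])

print(f"I(S; X)           = {mi_x:.4f} bits")
print(f"I(S; Z)           = {mi_z:.4f} bits  (deterministic, injective layer)")
print(f"I(S; quantized Z) = {mi_q:.4f} bits  (values merged by quantization)")
```

The deterministic layer leaves the estimate unchanged, while the quantized version lowers it because the two middle values of $Z$ fall into the same bin. This matches the summary's observation that quantized variants can achieve stronger invariance.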

Figures (25)

Figure 1: Accuracy vs. 1 - AUC-Discrimination trade-off for all combinations of the six datasets and eight models. Each model is shown for different $\gamma$ values, indicated by a colored point inside the model marker.
Figure 2: AUC vs. 1 - AUC-Discrimination trade-off for all combinations of the six datasets and eight models. Each model is shown for different $\gamma$ values, indicated by a colored point inside the model marker.
Figure 3: AutoML AUC vs. $\gamma$ for all combinations of the six datasets and eight models.
Figure 4: AutoML accuracy (ACC) vs. $\gamma$ for all combinations of the six datasets and eight models.
Figure 5: A graphical summary of EvalFRL, our experimentation library for FRL algorithms. It shows an overview of the Kedro pipeline used for preprocessing, hyperparameter optimization, $\gamma$ experiments and AutoML evaluation.
...and 20 more figures

Theorems & Definitions (2)

Theorem 1
proof
