Back to the Drawing Board for Fair Representation Learning

Angéline Pouget, Nikola Jovanović, Mark Vero, Robin Staab, Martin Vechev

TL;DR

This paper argues that FRL evaluation has drifted toward optimizing a single proxy task, which encourages overfitting and poor transfer to unseen tasks. It introduces TransFair, a benchmark with transfer-task–extended datasets and a protocol designed to measure FRL performance across multiple, diverse downstream tasks. Through TransFair-based experiments, the authors demonstrate that proxy-task–only optimization degrades transferability and that task-agnostic learning signals can improve generalization to weakly related tasks, albeit with tradeoffs on correlated tasks. The work advocates evaluating FRL methods by their utility on transfer tasks, and offers a benchmark and four criteria to guide future dataset construction and method development. The practical aim is fair representations that remain useful across a broad set of real-world downstream applications.

Abstract

The goal of Fair Representation Learning (FRL) is to mitigate biases in machine learning models by learning data representations that enable high accuracy on downstream tasks while minimizing discrimination based on sensitive attributes. The evaluation of FRL methods in many recent works primarily focuses on the tradeoff between downstream fairness and accuracy with respect to a single task that was used to approximate the utility of representations during training (proxy task). This incentivizes retaining only features relevant to the proxy task while discarding all other information. In extreme cases, this can cause the learned representations to collapse to a trivial, binary value, rendering them unusable in transfer settings. In this work, we argue that this approach is fundamentally mismatched with the original motivation of FRL, which arises from settings with many downstream tasks unknown at training time (transfer tasks). To remedy this, we propose to refocus the evaluation protocol of FRL methods primarily around the performance on transfer tasks. A key challenge when conducting such an evaluation is the lack of adequate benchmarks. We address this by formulating four criteria that a suitable evaluation procedure should fulfill. Based on these, we propose TransFair, a benchmark that satisfies these criteria, consisting of novel variations of popular FRL datasets with carefully calibrated transfer tasks. In this setting, we reevaluate state-of-the-art FRL methods, observing that they often overfit to the proxy task, which causes them to underperform on certain transfer tasks. We further highlight the importance of task-agnostic learning signals for FRL methods, as they can lead to more transferable representations.

Paper Structure (39 sections, 2 equations, 5 figures, 3 tables)

This paper contains 39 sections, 2 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Most FRL works conduct their main evaluation by measuring the predictive performance of their learned representations on a proxy task $y_p$ that has also often been used to train the representations $z_i$. However, such an approach does not provide any insight into the downstream performance of the representations on other tasks. In fact, two representations $z_1$ and $z_2$, retaining vastly different amounts of information from the original data $x$, could still be indistinguishable under this evaluation method by being equally predictive of $y_p$. We introduce the TransFair benchmark to facilitate transfer evaluations on other tasks $y_t$, allowing one to successfully identify richer representations.
  • Figure 2: Accuracy-fairness Pareto fronts achieved by different FRL methods that rely solely on the proxy label $y_p$ to evaluate utility during training, on ACS-Transfer (top) and Heritage-Health-Transfer (bottom). Transfer tasks are sorted by decreasing correlation with $y_p$, shown as SMC in parentheses. The area shaded in red indicates representations with higher unfairness than the unfair baseline.
  • Figure 3: TransFair Pareto fronts achieved by FARE, sIPM-LFR and CVIB, with and without the reconstruction loss. Top: Results on ACS-Transfer, Bottom: Results on Heritage-Health-Transfer.
  • Figure 4: A reasonable fairness-utility tradeoff is attainable for every label in principle. To demonstrate this, we train FARE (Eval) directly on the transfer label in question and compare it to FARE (Proxy), which was trained on the proxy task $y_p$.
  • Figure 5: Comparison of FARE with the reconstruction loss based on the mean squared distance (FARE (Rec)) and the absolute distance (FARE (Rec Abs)) on all transfer tasks. The top row shows the results on ACS-Transfer, while the bottom row shows the results on Heritage-Health-Transfer.
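
The captions above rely on two quantities that are easy to make concrete: the SMC between the proxy label $y_p$ and a transfer label $y_t$, and an accuracy-fairness Pareto front. The following is a minimal sketch, not the authors' code. It assumes that SMC denotes the simple matching coefficient (the fraction of samples on which two binary labels agree) and that each trained representation is summarized as an (unfairness, accuracy) pair where lower unfairness and higher accuracy are better. The function names and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def simple_matching_coefficient(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of positions where two binary label vectors agree.

    Assumption: this is what "SMC" in the Figure 2 caption refers to.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("label vectors must be non-empty and of equal length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def pareto_front(points: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the (unfairness, accuracy) points not dominated by any other point.

    A point is dominated if another point has unfairness that is no higher
    and accuracy that is no lower, with at least one of the two strictly better.
    """
    front: List[Tuple[float, float]] = []
    best_acc = float("-inf")
    # Sort by unfairness ascending; break ties by accuracy descending so the
    # best point at a given unfairness level is visited first.
    for unfair, acc in sorted(set(points), key=lambda p: (p[0], -p[1])):
        # Keep a point only if it beats every point that is at least as fair.
        if acc > best_acc:
            front.append((unfair, acc))
            best_acc = acc
    return front


# Hypothetical binary labels for eight samples (illustrative only).
y_p = [1, 0, 1, 1, 0, 0, 1, 0]
y_t = [1, 0, 0, 1, 0, 1, 1, 0]
smc = simple_matching_coefficient(y_p, y_t)

# Hypothetical (unfairness, accuracy) results for five trained representations.
results = [(0.02, 0.61), (0.05, 0.70), (0.05, 0.66), (0.10, 0.69), (0.15, 0.78)]
front = pareto_front(results)

print(f"SMC(y_p, y_t) = {smc}")
print("Pareto front:", front)
```

In a transfer evaluation of the kind the benchmark proposes, one such front would be computed per transfer task, with tasks ordered by their SMC to the proxy label.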