Rethinking Fair Representation Learning for Performance-Sensitive Tasks
Charles Jones, Fabio de Sousa Ribeiro, Mélanie Roschewitz, Daniel C. Castro, Ben Glocker
TL;DR
This work examines whether fair representation learning (FRL) is valid for performance-sensitive tasks. It uses a causal framework to expose the implicit assumptions and limitations of FRL in two settings: when training and test data share the same distribution (iid), and under distribution shift. It organizes the fairness literature into three paradigms (group parity, iid performance optimization, and unbiased distribution generalization) and develops a formal causal account of dataset bias in which the input X is decomposed into task-related features X_Z and sensitive features X_A. The authors prove fundamental limitations of FRL in iid settings and propose two hypotheses for when it may be valid under distribution shift. Experiments across medical imaging modalities show that FRL’s benefits are conditional on bias structure and subgroup separability. The results call for explicit bias analysis and careful, domain-aware evaluation before fairness methods are deployed in real-world, high-stakes applications, and they offer practical guidance on when FRL may help and when it may harm.
Abstract
We investigate the prominent class of fair representation learning methods for bias mitigation. Using causal reasoning to define and formalise different sources of dataset bias, we reveal important implicit assumptions inherent to these methods. We prove fundamental limitations on fair representation learning when evaluation data is drawn from the same distribution as training data and run experiments across a range of medical modalities to examine the performance of fair representation learning under distribution shifts. Our results explain apparent contradictions in the existing literature and reveal how rarely considered causal and statistical aspects of the underlying data affect the validity of fair representation learning. We raise doubts about current evaluation practices and the applicability of fair representation learning methods in performance-sensitive settings. We argue that fine-grained analysis of dataset biases should play a key role in the field moving forward.
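The iid limitation can be illustrated with a toy calculation. The sketch below is not the paper's proof or experimental setup. It is a minimal discrete example under assumed, hypothetical parameters: a binary subgroup A, a binary label Y whose prevalence differs by subgroup, a noisy task feature X_Z, and a sensitive feature X_A taken to equal A. It computes the exact accuracy of the Bayes-optimal classifier with and without access to X_A when the test distribution matches the training distribution.

```python
from itertools import product

# Hypothetical toy parameters (assumptions for illustration, not from the paper).
P_A = {0: 0.5, 1: 0.5}            # subgroup proportions
P_Y1_GIVEN_A = {0: 0.2, 1: 0.8}   # label prevalence differs by subgroup
P_XZ_MATCHES_Y = 0.7              # X_Z is a noisy copy of Y


def joint(a, y, x_z):
    """p(A=a, Y=y, X_Z=x_z); X_A is assumed to equal A (perfect separability)."""
    p_y = P_Y1_GIVEN_A[a] if y == 1 else 1.0 - P_Y1_GIVEN_A[a]
    p_xz = P_XZ_MATCHES_Y if x_z == y else 1.0 - P_XZ_MATCHES_Y
    return P_A[a] * p_y * p_xz


def bayes_accuracy(feature_fn):
    """Exact accuracy of the Bayes-optimal classifier that only sees feature_fn(a, x_z)."""
    mass = {}  # (features, y) -> joint probability
    for a, y, x_z in product((0, 1), repeat=3):
        key = (feature_fn(a, x_z), y)
        mass[key] = mass.get(key, 0.0) + joint(a, y, x_z)
    features = {f for f, _ in mass}
    # For each feature value, the optimal rule predicts the more probable label.
    return sum(max(mass.get((f, 0), 0.0), mass.get((f, 1), 0.0)) for f in features)


acc_full = bayes_accuracy(lambda a, x_z: (x_z, a))  # classifier sees X_Z and X_A
acc_fair = bayes_accuracy(lambda a, x_z: (x_z,))    # sensitive feature removed

print(f"full features: {acc_full:.2f}, X_A removed: {acc_fair:.2f}")
```

With these assumed numbers, removing X_A lowers the best achievable iid accuracy, because the subgroup carries information about the label that the remaining feature does not. Whether the same removal helps under distribution shift depends on how the bias arises, which is the question the paper's experiments address.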
