Utility-Fairness Trade-Offs and How to Find Them
Sepehr Dehdashtian, Bashir Sadeghi, Vishnu Naresh Boddeti
TL;DR
The paper tackles the core challenge of balancing predictive utility with demographic fairness by introducing two intrinsic trade-offs: the Data-Space Trade-Off (DST) and the Label-Space Trade-Off (LST). It then presents U-FaTE, a scalable method that numerically quantifies these trade-offs from data via a closed-form, RKHS-based optimization using a universal dependence measure, enabling practical estimation of the trade-offs for various fairness definitions. The framework supports evaluating and comparing representations, including those from zero-shot CLIP and supervised models, by measuring how far they lie from the estimated DST and LST. This comparison reveals that many pre-trained models are far from the estimated limits; in some cases, extra data can push performance beyond the DST. The work provides extensive empirical results on CelebA, FairFace, and FolkTables, demonstrates the stability of the DST/LST estimates, and offers a principled tool for understanding and guiding fair representation learning in high-stakes settings. Overall, U-FaTE and the DST/LST concepts advance the quantitative understanding of fairness-utility compromises and offer a practical way to assess and improve representations with respect to specified fairness criteria.
Abstract
When building classification systems with demographic fairness considerations, there are two objectives to satisfy: 1) maximizing utility for the specific task and 2) ensuring fairness w.r.t. a known demographic attribute. These objectives often compete, so optimizing both can lead to a trade-off between utility and fairness. While existing works acknowledge the trade-offs and study their limits, two questions remain unanswered: 1) What are the optimal trade-offs between utility and fairness? and 2) How can we numerically quantify these trade-offs from data for a desired prediction task and demographic attribute of interest? This paper addresses these questions. We introduce two utility-fairness trade-offs: the Data-Space Trade-off and the Label-Space Trade-off. The trade-offs reveal three regions within the utility-fairness plane, delineating what is fully possible, partially possible, and impossible. We propose U-FaTE, a method to numerically quantify the trade-offs for a given prediction task and group fairness definition from data samples. Based on the trade-offs, we introduce a new scheme for evaluating representations. An extensive evaluation of fair representation learning methods and representations from over 1000 pre-trained models revealed that most current approaches are far from the estimated and achievable fairness-utility trade-offs across multiple datasets and prediction tasks.
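To make the utility-fairness plane concrete, the sketch below places two toy classifiers on it. This is not U-FaTE and does not estimate the DST or LST. It assumes binary predictions, takes accuracy as the utility measure, and takes the demographic parity gap (the largest difference in positive-prediction rate between groups) as one example of a group fairness definition. The function name `utility_fairness_point` and the toy data are hypothetical.

```python
import numpy as np


def utility_fairness_point(y_true, y_pred, group):
    """Return (accuracy, demographic parity gap) for binary predictions.

    Assumed definitions, chosen for illustration only:
    utility  = fraction of correct predictions,
    fairness = max minus min positive-prediction rate across groups
               (0 means the prediction rate is identical in every group).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)

    accuracy = float(np.mean(y_true == y_pred))
    # Positive-prediction rate within each demographic group.
    rates = [float(np.mean(y_pred[group == g])) for g in np.unique(group)]
    dp_gap = max(rates) - min(rates)
    return accuracy, dp_gap


# Toy data in which the label is correlated with the demographic attribute.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Classifier A reproduces the labels exactly: best utility, unfair rates.
acc_a, gap_a = utility_fairness_point(y_true, y_true, group)

# Classifier B always predicts 0: equal rates in both groups, low utility.
acc_b, gap_b = utility_fairness_point(y_true, np.zeros_like(y_true), group)

print(f"A: accuracy={acc_a:.2f}, parity gap={gap_a:.2f}")
print(f"B: accuracy={acc_b:.2f}, parity gap={gap_b:.2f}")
```

Because the labels here depend on the group, the perfectly accurate classifier has a nonzero parity gap, while the constant classifier closes the gap at the cost of accuracy. These two points mark opposite ends of the plane that the trade-off curves in the abstract are meant to characterize.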
