Supervised Algorithmic Fairness in Distribution Shifts: A Survey

Minglai Shao, Dong Li, Chen Zhao, Xintao Wu, Yujie Lin, Qin Tian

TL;DR

This survey addresses how to keep predictions fair when supervised models encounter distribution shifts between source and target domains. It introduces a taxonomy of shift types (covariate, label, concept, demographic, dependence, and hybrids of these) and groups existing methods into six families: feature disentanglement, data augmentation, causal inference, reweighting, robust optimization, and regularization-based approaches. It catalogs publicly available datasets across tabular, image, text, and graph domains and reviews evaluation metrics for fairness under shifts, including demographic parity (DP), equalized odds (EO), equal opportunity (EOp), consistency, and counterfactual measures. The work highlights the challenges of generalizing fairness to non-stationary environments and outlines directions for future research on robustness and real-world deployment of equitable ML systems.
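As a rough illustration of the group fairness metrics named above, the following sketch computes DP, EOp, and EO gaps for a binary classifier and a binary sensitive attribute. The function names and the toy arrays are hypothetical, and taking the EO gap as the larger of the true-positive-rate and false-positive-rate gaps is one common convention, not necessarily the definition used in the survey.

```python
import numpy as np

def _rate(pred, mask):
    """Mean positive-prediction rate over the rows selected by mask."""
    return float(pred[mask].mean())

def dp_gap(y_pred, s):
    """Demographic parity gap: |P(Yhat=1 | S=0) - P(Yhat=1 | S=1)|."""
    return abs(_rate(y_pred, s == 0) - _rate(y_pred, s == 1))

def eop_gap(y_true, y_pred, s):
    """Equal opportunity gap: difference in true positive rates across groups."""
    pos = y_true == 1
    return abs(_rate(y_pred, pos & (s == 0)) - _rate(y_pred, pos & (s == 1)))

def eo_gap(y_true, y_pred, s):
    """Equalized odds gap, taken here as max(TPR gap, FPR gap) (an assumed convention)."""
    neg = y_true == 0
    fpr_gap = abs(_rate(y_pred, neg & (s == 0)) - _rate(y_pred, neg & (s == 1)))
    return max(eop_gap(y_true, y_pred, s), fpr_gap)

# Toy data (hypothetical): eight samples, four per sensitive group.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
s      = np.array([0, 0, 0, 0, 1, 1, 1, 1])

print(dp_gap(y_pred, s))           # 0.5
print(eop_gap(y_true, y_pred, s))  # 0.5
print(eo_gap(y_true, y_pred, s))   # 0.5
```

A gap of 0 means the two groups are treated identically under that metric; under a distribution shift, the same model can report a different gap on target data than on source data.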

Abstract

Supervised fairness-aware machine learning under distribution shifts is an emerging field that addresses the challenge of maintaining equitable and unbiased predictions when the data distribution changes between source and target domains. In real-world applications, machine learning models are often trained on a specific dataset but deployed in environments where the data distribution may shift over time due to various factors. This shift can lead to unfair predictions that disproportionately affect groups characterized by sensitive attributes such as race and gender. In this survey, we summarize the various types of distribution shifts and comprehensively investigate existing methods based on these shifts, highlighting six commonly used approaches in the literature. Additionally, this survey lists publicly available datasets and evaluation metrics for empirical studies. We further explore the connections with related research fields, discuss the significant challenges, and identify potential directions for future studies.

Paper Structure (12 sections, 8 equations, 1 figure, 3 tables)

Figures (1)

  • Figure 1: An illustration of fairness-aware machine learning under various distribution shifts. We consider $S=1$ and $\mathbf{x}=[x_1,x_2]^T$ as a simple example of a two-dimensional feature vector. (Left) A fair classifier $f_{\boldsymbol{\theta}}$ is learned using data sampled from a source domain. (Right) The learned $f_{\boldsymbol{\theta}}$ is applied to data sampled from various types of shifted target domains, resulting in misclassification and unfairness. $f_{\boldsymbol{\theta}}^*$ represents the true classifier in the target domain.
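The situation described in the caption can be mimicked with a small deterministic sketch: a fixed linear classifier on two-dimensional features has no demographic parity gap on source data, but shows one after the features of one group are shifted. The weights `theta`, the data points, and the size and direction of the shift are all hypothetical and chosen only for illustration; they are not taken from the paper's figure.

```python
import numpy as np

theta = np.array([1.0, 1.0])  # hypothetical weights of a linear classifier f_theta

def predict(x):
    """Predict 1 when theta^T x > 0, else 0."""
    return (x @ theta > 0).astype(int)

def dp_gap(y_pred, s):
    """Absolute difference in positive-prediction rates between the two groups."""
    return abs(y_pred[s == 0].mean() - y_pred[s == 1].mean())

# Source domain: four points per sensitive group (toy values).
x_src = np.array([
    [1.0, 1.0], [-1.0, -1.0], [2.0, -1.0], [-2.0, 1.0],    # group s = 0
    [1.0, 0.5], [-1.0, -0.5], [0.5, 1.0], [-0.5, -1.0],    # group s = 1
])
s = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Target domain: only the features of group s = 1 move (an assumed toy shift).
x_tgt = x_src.copy()
x_tgt[s == 1] += np.array([-1.0, -1.0])

src_gap = dp_gap(predict(x_src), s)
tgt_gap = dp_gap(predict(x_tgt), s)
print(src_gap, tgt_gap)  # 0.0 0.5
```

The classifier itself never changes between the two evaluations; the fairness violation on the target data comes entirely from the change in the feature distribution.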