Fair Domain Generalization: An Information-Theoretic View

Tangzheng Lian, Guanyu Hu, Dimitrios Kollias, Xinyu Yang, Oya Celiktutan

TL;DR

This work tackles Fair Domain Generalization (FairDG), seeking to minimize both risk on unseen target domains and fairness violations under domain shifts. It provides novel mutual-information–based upper bounds for target risk and Equalized Odds violations, and then develops a practical Pareto-optimized framework (PAFDG) that learns domain- and group-invariant representations. The method uses differentiable dependence measures (distance correlation) on learned encodings and trains via a two-stage process to yield a Pareto front of utility–fairness trade-offs, with an efficient lambda-conditioned training strategy. Experiments on CelebA, AffectNet, and Jigsaw demonstrate superior Pareto fronts and single-solution performance compared to DG and fairness baselines, highlighting scalability to multi-class, multi-group fairness under distribution shifts.
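
The distance-correlation dependence measure mentioned above can be illustrated with a small sketch. This is a plain-Python version of the standard sample distance correlation, not the authors' implementation; the function names and the toy inputs are hypothetical. For training, the same arithmetic would be written in an autodiff framework so that gradients can flow back to the encoder.

```python
import math


def pairwise_distances(rows):
    """Euclidean distance matrix between the rows of a batch of encodings."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    n = len(d)
    row_means = [sum(row) / n for row in d]
    col_means = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two batches of vectors.

    x and y are lists of equal length; each element is a sequence of floats.
    The result lies in [0, 1]; small values indicate weak dependence.
    """
    n = len(x)
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2_xy = mean_product(a, b)  # squared distance covariance
    dvar2_x = mean_product(a, a)   # squared distance variance of x
    dvar2_y = mean_product(b, b)   # squared distance variance of y
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom <= 0.0:
        return 0.0  # one batch is constant, so no dependence is measurable
    return math.sqrt(max(dcov2_xy, 0.0) / denom)


# Hypothetical toy batches: z_dom is an exact linear function of z.
z = [[0.0], [1.0], [2.0], [3.0]]
z_dom = [[1.0], [3.0], [5.0], [7.0]]
print(distance_correlation(z, z_dom))
```

Minimizing a term like this between the encoder output and a frozen domain or group encoding pushes the learned representation toward independence from that encoding, which is the role the TL;DR assigns to the dependence measure.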

Abstract

Domain generalization (DG) and algorithmic fairness are two critical challenges in machine learning. However, most DG methods focus only on minimizing expected risk in the unseen target domain without considering algorithmic fairness. Conversely, fairness methods typically do not account for domain shifts, so the fairness achieved during training may not generalize to unseen test domains. In this work, we bridge these gaps by studying the problem of Fair Domain Generalization (FairDG), which aims to minimize both expected risk and fairness violations in unseen target domains. We derive novel mutual information-based upper bounds for expected risk and fairness violations in multi-class classification tasks with multi-group sensitive attributes. These bounds provide key insights for algorithm design from an information-theoretic perspective. Guided by these insights, we introduce PAFDG (Pareto-Optimal Fairness for Domain Generalization), a practical framework that solves the FairDG problem and models the utility-fairness trade-off through Pareto optimization. Experiments on real-world vision and language datasets show that PAFDG achieves superior utility-fairness trade-offs compared to existing methods.
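
To make "fairness violations" with multi-class labels and multi-group sensitive attributes concrete, the sketch below computes one common form of Equalized Odds violation: the largest gap, across groups, in the rate of predicting a label given the true label. This is an assumed definition chosen for illustration; the exact metric used in the paper may differ, and the function name is hypothetical.

```python
from collections import defaultdict


def equalized_odds_violation(y_true, y_pred, groups):
    """Largest between-group gap in P(pred = p | true = y, group = g).

    Works for any number of classes and any number of groups.
    Returns 0.0 when predictions satisfy Equalized Odds on this sample.
    """
    counts = defaultdict(int)  # (true label, group) -> sample count
    hits = defaultdict(int)    # (true label, group, predicted label) -> count
    for y, p, g in zip(y_true, y_pred, groups):
        counts[(y, g)] += 1
        hits[(y, g, p)] += 1

    labels = set(y_true) | set(y_pred)
    group_ids = set(groups)
    worst_gap = 0.0
    for y in set(y_true):
        for p in labels:
            # Conditional prediction rate per group, skipping empty cells.
            rates = [
                hits[(y, g, p)] / counts[(y, g)]
                for g in group_ids
                if counts[(y, g)] > 0
            ]
            if len(rates) >= 2:
                worst_gap = max(worst_gap, max(rates) - min(rates))
    return worst_gap


# Toy example: group "b" is always misclassified when the true label is 0.
print(equalized_odds_violation([0, 0, 1, 1], [0, 1, 1, 1], ["a", "b", "a", "b"]))
```

A value of 0 means every group has the same conditional prediction rates on the evaluated sample; evaluating it on held-out target-domain data is what distinguishes the FairDG setting from fairness measured only on source domains.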

Paper Structure

This paper contains 10 sections, 122 equations, 4 figures, 10 tables.

Figures (4)

  • Figure 1: A real-world example of the FairDG problem. The goal is to train a model that generalizes to an unseen domain (age group: 0-9) while also ensuring fairness by minimizing performance disparities across perceived racial groups.
  • Figure 2: Our proposed method. $\hat{f}_{\theta_D}$ and $\hat{f}_{\theta_G}$ are trained in stage 1 and then frozen to produce $Z_D$ and $Z_G$, which guide $\hat{f}_{\theta_E}$ to learn a fair and domain-invariant $Z$ by minimizing two dependence terms, where $\lambda$ controls the utility–fairness trade-off and $\gamma$ adjusts the source domain invariance.
  • Figure 3: Visualization of Pareto fronts ($\mathcal{P}_{\text{norm}}$) for the fairness-only and FairDG methods. The shaded area marks the method with the highest HVI. Please refer to Appendix G for the exact HVI values corresponding to this figure.
  • Figure 4: EOD Violation and Accuracy with respect to the trade-off coefficient $\lambda$ (CelebA dataset).
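
The HVI in Figure 3 presumably refers to the hypervolume indicator of a Pareto front, which the caption depicts as a shaded area. As a rough illustration, the sketch below computes the 2D hypervolume for points where both coordinates are maximized (for example, normalized accuracy and normalized fairness). The reference point `(0, 0)`, the normalization, and the function name are assumptions; the paper's exact setup is not given here.

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Area dominated by a set of 2D points, both objectives maximized.

    points: iterable of (utility, fairness) pairs.
    ref:    reference point that bounds the area from below (assumed (0, 0)).
    Dominated points contribute nothing, so the input need not be filtered.
    """
    area = 0.0
    best_y = ref[1]
    # Sweep from the largest first coordinate to the smallest.
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if x > ref[0] and y > best_y:
            # Add the new horizontal strip this point contributes.
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


# Hypothetical normalized trade-off points for one method.
front = [(1.0, 0.2), (0.5, 0.5), (0.2, 1.0)]
print(hypervolume_2d(front))
```

Under this convention a larger value means the front covers more of the utility–fairness plane, which is why a single number can be used to rank whole Pareto fronts rather than individual solutions.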