Long-Term Fair Decision Making through Deep Generative Models

Yaowei Hu, Yongkai Wu, Lu Zhang

TL;DR

This work tackles long-term fairness in sequential decision-making by modeling dynamics with a temporal causal graph and deploying soft interventions to capture deployment effects. It introduces a 1-Wasserstein-based metric J1^T(θ) to quantify disparities between demographic groups at a future time, and shows how minimizing this distance can reconcile DP and EO under Lipschitz assumptions. The authors propose DeepLF, a three-phase framework consisting of a base predictor (Phase 1), a recurrent conditional GAN (Phase 2) to generate high-fidelity observational and interventional data, and a performative-risk-optimized long-term fair model (Phase 3). Empirical results on synthetic and semi-synthetic datasets demonstrate that DeepLF achieves a better balance between long-term fairness, local fairness, and predictive utility than baselines, highlighting its potential for fair, dynamic decision-making in real-world settings where sensitive attributes are limited. The approach advances practical long-horizon fairness by leveraging causal structure and data-driven generative modeling to counteract feedback loops and distribution shifts.

Abstract

This paper studies long-term fair machine learning, which aims to mitigate group disparity over the long term in sequential decision-making systems. To define long-term fairness, we leverage the temporal causal graph and use the 1-Wasserstein distance between the interventional distributions of different demographic groups at a sufficiently large time step as the quantitative metric. We then propose a three-phase learning framework in which the decision model is trained on high-fidelity data generated by a deep generative model. We formulate the optimization problem as performative risk minimization and adopt the repeated gradient descent algorithm for learning. Empirical evaluation on both synthetic and semi-synthetic datasets shows the efficacy of the proposed method.
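To make the fairness metric concrete, the following minimal sketch estimates a 1-Wasserstein distance between two groups from samples. It is an illustration under simplifying assumptions, not the paper's implementation: the feature is a single hypothetical 1-D score, both groups have the same number of samples, and the names `wasserstein_1d`, `group_a`, and `group_b` are invented for this example. In one dimension with equal sample sizes, the empirical distance reduces to the mean absolute difference between the sorted samples.

```python
import numpy as np


def wasserstein_1d(x, y):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    Assumption of this sketch: scalar features and equal sample sizes, so the
    optimal coupling simply matches the i-th smallest value of x with the
    i-th smallest value of y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(x - y)))


# Hypothetical data: group_b is group_a shifted by a constant 0.5,
# so the distance between the two empirical distributions is 0.5.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=5000)
group_b = group_a + 0.5

gap = wasserstein_1d(group_a, group_b)
same = wasserstein_1d(group_a, group_a)
```

A distance of zero means the two groups' distributions coincide; in the paper's setting the distributions compared would be the interventional feature distributions of the groups at a late time step, which are generally multi-dimensional and need a different estimator.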

Paper Structure (16 sections, 1 theorem, 10 equations, 5 figures, 1 algorithm)


Key Result

Proposition 1

Let $d$ be the 1-Wasserstein distance given in the definition of long-term fairness. For any sensitive attribute-unconscious decision model $f: \mathcal{X} \mapsto \mathcal{A}$ that is Lipschitz continuous, its DP is bounded by $l_f \cdot d$ where $l_f$ is the Lipschitz constant of $f$. If we assume that the true labe
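The DP part of this bound can be motivated by Kantorovich-Rubinstein duality. The sketch below is a standard argument under stated assumptions, not a reproduction of the paper's proof: it assumes that DP is measured as the absolute difference in expected decisions between the two groups, that $f$ is scalar-valued with $l_f > 0$, and that $P_0$ and $P_1$ (symbols introduced here for illustration) denote the two groups' feature distributions whose 1-Wasserstein distance is $d$.

```latex
\begin{align*}
\mathrm{DP}
  &= \bigl| \mathbb{E}_{X \sim P_0}[f(X)] - \mathbb{E}_{X \sim P_1}[f(X)] \bigr| \\
  &= l_f \, \Bigl| \mathbb{E}_{X \sim P_0}\bigl[\tfrac{1}{l_f} f(X)\bigr]
       - \mathbb{E}_{X \sim P_1}\bigl[\tfrac{1}{l_f} f(X)\bigr] \Bigr| \\
  &\le l_f \, \sup_{g \,:\, \mathrm{Lip}(g) \le 1}
       \bigl| \mathbb{E}_{X \sim P_0}[g(X)] - \mathbb{E}_{X \sim P_1}[g(X)] \bigr| \\
  &= l_f \, W_1(P_0, P_1) = l_f \cdot d .
\end{align*}
```

The inequality holds because $f / l_f$ is 1-Lipschitz and is therefore one of the candidates in the supremum; the final equality is the dual form of the 1-Wasserstein distance.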

Figures (5)

  • Figure 1: A temporal causal graph for sequential decision making.
  • Figure 2: The overview of the proposed framework. Solid arrows represent input, and the dashed arrow represents parameter sharing. For Phase 3 only one generator is shown.
  • Figure 3: The architecture of the RCGAN.
  • Figure 4: Accuracy ($\uparrow$), local and long-term unfairness ($\downarrow$) of different algorithms on SimLoan ((a) and (b)) and Taiwan ((c) and (d)) datasets. The decision models are trained on generated data within the time range $[1, 10]$. (a) and (c): Results of evaluation on generated data within time range $[1, 10]$. (b) and (d): Results of evaluation on generated data within the time range $[10, 19]$.
  • Figure 5: T-SNE of generated data distributions at time step $t=10$ produced by MLP (left) and DeepLF (right).

Theorems & Definitions (4)

  • Definition 1
  • Proposition 1
  • Definition 2
  • Definition 3