Liouville PDE-based sliced-Wasserstein flow for fair regression

Pilhwa Lee, Jayshawn Cooper

TL;DR

The paper tackles fair regression under strong demographic parity by framing distribution alignment as a Wasserstein barycenter problem. It introduces Liouville PDE-based sliced-Wasserstein flows (without diffusion) and approximates the Wasserstein barycenter via Kantorovich potentials, coupled with neural-ODE-based density estimation to reduce variance. Empirical results on synthetic transport tasks and crime/health spending datasets show improved convergence and favorable accuracy-fairness tradeoffs, especially in high-dimensional scenarios, while remaining computationally efficient relative to exact barycenters. This work provides a scalable, nonparametric approach to fairness-aware regression via dynamic optimal transport.
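
The barycenter framing can be illustrated in the simplest setting. For scalar predictions, a standard fact of one-dimensional optimal transport is that the Wasserstein-2 barycenter has a quantile function equal to the weighted average of the group quantile functions. The sketch below uses that closed form, not the paper's SWF barycenter; the function name `barycenter_fair_predictions` and the mid-rank convention for empirical CDF levels are assumptions made for illustration.

```python
import numpy as np

def barycenter_fair_predictions(preds, groups):
    """Map each group's predictions onto the 1D Wasserstein-2 barycenter.

    Hypothetical helper: closed-form 1D construction via quantile averaging,
    not the sliced-Wasserstein flow described in the paper.
    """
    preds = np.asarray(preds, dtype=float)
    groups = np.asarray(groups)
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()  # empirical p_s = P(S = s)
    fair = np.empty_like(preds)
    for s in labels:
        mask = groups == s
        x = preds[mask]
        # Empirical CDF level of each prediction within its own group
        # (mid-rank convention, an assumption, keeps levels inside (0, 1)).
        ranks = np.argsort(np.argsort(x))
        levels = (ranks + 0.5) / x.size
        # Barycenter quantile function: weighted average of group quantiles.
        fair[mask] = sum(
            w * np.quantile(preds[groups == t], levels)
            for t, w in zip(labels, weights)
        )
    return fair
```

After this map, every group's predictions follow approximately the same distribution, which is what strong demographic parity asks for; the cost is the distortion of the original predictions, which is the accuracy side of the tradeoff.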

Abstract

The sliced Wasserstein flow (SWF), a nonparametric and implicit generative gradient flow, is applied to fair regression. We improve the SWF in two respects. First, the stochastic diffusive term of the Fokker-Planck equation-based Monte Carlo scheme is replaced by Liouville partial differential equation (PDE)-based transport with density estimation, which has no diffusive term. Second, the computation of the Wasserstein barycenter is approximated by the SWF barycenter, with Kantorovich potentials prescribed for the induced gradient flow that generates its samples. These two changes improve convergence in training and testing of the SWF and SWF barycenters, with reduced variance. Applying the generative SWF barycenter to fair regression yields competitive accuracy-fairness Pareto curves.
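
To make the idea of a diffusion-free sliced-Wasserstein flow concrete, here is a minimal particle sketch. It is not the authors' algorithm: it omits the density estimation and the explicit Kantorovich potentials, and instead uses the fact that in one dimension the optimal transport map is given by quantile matching. The function name `sliced_wasserstein_step` and its parameters are hypothetical.

```python
import numpy as np

def sliced_wasserstein_step(particles, target, n_projections=50,
                            step_size=1.0, rng=None):
    """One deterministic sliced-Wasserstein flow step (no noise term).

    Illustrative sketch only: each random direction contributes the 1D
    optimal transport displacement obtained by quantile matching.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = particles.shape
    # Random directions, normalised onto the unit sphere.
    thetas = rng.normal(size=(n_projections, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    # Quantile levels assigned to the sorted particles in each projection.
    levels = (np.arange(n) + 0.5) / n
    drift = np.zeros_like(particles)
    for theta in thetas:
        proj_p = particles @ theta
        proj_t = target @ theta
        order = np.argsort(proj_p)
        # 1D optimal transport: the i-th smallest projected particle moves
        # to the matching quantile of the projected target samples.
        target_q = np.quantile(proj_t, levels)
        displacement = np.empty(n)
        displacement[order] = target_q - proj_p[order]
        drift += displacement[:, None] * theta[None, :]
    # Average over directions; there is no added Gaussian noise.
    return particles + step_size * drift / n_projections
```

Repeated calls move the particle cloud toward the target samples. Because the average of $\theta\theta^\top$ over the unit sphere is $I/d$, the effective step in each coordinate is roughly `step_size / d`, so larger dimensions need more iterations or a larger `step_size`.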

Paper Structure

This paper contains 21 sections, 6 theorems, 65 equations, 5 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

Assume, for each $s \in S$, that the measure $\nu_{f^*|s}$ has a density and let $p_s = \mathbb{P}(S=s)$. Then,

Figures (5)

  • Figure 1: Sliced Wasserstein flows (SWF). (a) and (b) Training and testing of the optimal transport to the defined target distribution. Two approaches are compared: the original SWF (liutkus2019) and the proposed Liouville PDE-based SWF. (c)-(e) The target distribution (shaded contour plot) and the initial, intermediate, and final distributions of particles (isopotential lines) in the converging Liouville PDE-based SWF.
  • Figure 2: Sliced Wasserstein barycenter flows (SWF barycenter).
  • Figure 3: Pareto curve: accuracy (MSE) and fairness (KS score) are paired from the community crime and health care spending cost datasets.
  • Figure S1: Convergence in training the sliced-Wasserstein barycenters on the community crimes and health care spending datasets: dependence on the regularization parameter $\lambda$.
  • Figure S2: Convergence in training and testing on the community crimes (a, b) and health care spending (c, d) datasets. The convergence of the SWF and Liouville PDE-based SWF is compared.
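
The two axes of the Pareto curve in Figure 3 can be sketched as follows. The exact definitions used in the paper are not reproduced on this page, so the code assumes one common form: MSE between targets and predictions, and the largest two-sample Kolmogorov-Smirnov distance between the prediction distributions of any two sensitive groups. The names `mse`, `ks_distance`, and `fairness_ks` are hypothetical.

```python
import itertools
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def ks_distance(a, b):
    """Two-sample KS distance: sup over t of |F_a(t) - F_b(t)|."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # The supremum is attained at one of the sample points.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def fairness_ks(y_pred, groups):
    """Largest pairwise KS distance between group-wise predictions.

    Assumed form of the fairness score; needs at least two groups.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    return max(
        ks_distance(y_pred[groups == s], y_pred[groups == t])
        for s, t in itertools.combinations(labels, 2)
    )
```

A score of 0 means the group-wise prediction distributions coincide on the sample, and a score of 1 means they are completely separated; sweeping a fairness-accuracy tradeoff parameter and recording `(mse, fairness_ks)` pairs would trace out a curve of this kind.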

Theorems & Definitions (13)

  • Definition 1: Demographic Parity
  • Theorem 1: Chzhen, et al. 2020
  • Theorem 2: liutkus2019
  • Definition 2: Mean Squared Error in Fair Regression
  • Definition 3: Empirical Kolmogorov-Smirnov (KS) distance
  • Theorem 3
  • Theorem S1
  • proof
  • Definition S1: Minimizing movement scheme
  • Theorem S2
  • ...and 3 more