Distributionally Fair Stochastic Optimization using Wasserstein Distance

Qing Ye, Grani A. Hanasusanto, Weijun Xie

TL;DR

This paper introduces Distributionally Fair Stochastic Optimization (DFSO) based on the Wasserstein fairness measure, establishes the exactness of the Gelbrich bound, and quantifies the theoretical difference between the Wasserstein fairness measure and the Gelbrich bound.

Abstract

A traditional stochastic program under a finite population typically seeks to optimize efficiency by maximizing the expected profits or minimizing the expected costs, subject to a set of constraints. However, implementing such optimization-based decisions can have varying impacts on individuals, and when assessed using the individuals' utility functions, these impacts may differ substantially across demographic groups delineated by sensitive attributes, such as gender, race, age, and socioeconomic status. As each group comprises multiple individuals, a common remedy is to enforce group fairness, which necessitates the measurement of disparities in the distributions of utilities across different groups. This paper introduces the concept of Distributionally Fair Stochastic Optimization (DFSO) based on the Wasserstein fairness measure. The DFSO aims to minimize distributional disparities among groups, quantified by the Wasserstein distance, while adhering to an acceptable level of inefficiency. Our analysis reveals that: (i) the Wasserstein fairness measure recovers the demographic parity fairness prevalent in binary classification literature; (ii) this measure can approximate the well-known Kolmogorov-Smirnov fairness measure with considerable accuracy; and (iii) despite DFSO's biconvex nature, the epigraph of the Wasserstein fairness measure is generally Mixed-Integer Convex Programming Representable (MICP-R). Additionally, we introduce two distinct lower bounds for the Wasserstein fairness measure: the Jensen bound, applicable to the general Wasserstein fairness measure, and the Gelbrich bound, specific to the type-2 Wasserstein fairness measure. We establish the exactness of the Gelbrich bound and quantify the theoretical difference between the Wasserstein fairness measure and the Gelbrich bound.
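
To make the two lower bounds concrete, the following minimal sketch compares them with the type-2 Wasserstein distance for two groups of scalar utilities. It rests on assumptions not taken from the paper: the Jensen bound is taken to be the absolute difference of group means, the Gelbrich bound is taken in its standard scalar form $\sqrt{(\mu_a-\mu_{\bar a})^2+(\sigma_a-\sigma_{\bar a})^2}$, both groups have the same number of equally weighted individuals, and all function names are hypothetical.

```python
import numpy as np


def w2_equal_size(u, v):
    """Type-2 Wasserstein distance between two equally sized, uniformly
    weighted 1-D samples: match sorted values and take the RMS gap."""
    u = np.sort(np.asarray(u, dtype=float))
    v = np.sort(np.asarray(v, dtype=float))
    if u.shape != v.shape:
        raise ValueError("this sketch assumes equal group sizes")
    return float(np.sqrt(np.mean((u - v) ** 2)))


def jensen_bound(u, v):
    """Assumed form of the Jensen bound: gap between the group means."""
    return float(abs(np.mean(u) - np.mean(v)))


def gelbrich_bound(u, v):
    """Assumed scalar form of the Gelbrich bound: combines the gap in means
    and the gap in (population) standard deviations."""
    mean_gap = np.mean(u) - np.mean(v)
    std_gap = np.std(u) - np.std(v)  # ddof=0 matches the empirical distribution
    return float(np.sqrt(mean_gap ** 2 + std_gap ** 2))


if __name__ == "__main__":
    # Hypothetical utilities for two demographic groups.
    group_a = [0.0, 1.0, 2.0, 5.0]
    group_b = [0.0, 0.0, 4.0, 4.0]
    print("Jensen  :", jensen_bound(group_a, group_b))
    print("Gelbrich:", gelbrich_bound(group_a, group_b))
    print("W_2     :", w2_equal_size(group_a, group_b))
```

Under these assumptions the three quantities are ordered Jensen ≤ Gelbrich ≤ $W_2$, and the Gelbrich value coincides with $W_2$ when one group's utilities are a positive affine transformation of the other's.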

Paper Structure

This paper contains 45 sections, 40 theorems, 128 equations, 4 figures, 7 tables.

Key Result

Lemma 1

For any $a<\bar{a}\in A$ and a fixed decision $\bm{x}$, the Wasserstein distance $W_q\left({\mathbb{P}}_{f(\bm{x},\tilde{\bm\xi}_a)},{\mathbb{P}}_{f(\bm{x},\tilde{\bm\xi}_{\bar{a}})}\right)$ can be expressed as

$$W_q\left({\mathbb{P}}_{f(\bm{x},\tilde{\bm\xi}_a)},{\mathbb{P}}_{f(\bm{x},\tilde{\bm\xi}_{\bar{a}})}\right)=\left(\int_0^1\left|F^{-1}_a(y)-F^{-1}_{\bar{a}}(y)\right|^q\,dy\right)^{1/q},$$

where $F^{-1}_a$ is the inverse distribution function of the random function $f(\bm{x},\tilde{\bm\xi}_a)$ for each $a\in A$. When $q=1$, the type-1 Wasserstein distance $W_1$ …
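
The quantile representation in Lemma 1, which integrates the $q$-th power of the gap between the two inverse distribution functions over $(0,1)$, can be evaluated exactly for finite populations because empirical quantile functions are piecewise constant. The sketch below is illustrative only: it assumes each group is a finite sample of scalar utilities with uniform weights, and the function name `wasserstein_q` is hypothetical rather than taken from the paper.

```python
import numpy as np


def wasserstein_q(u, v, q=1):
    """Type-q Wasserstein distance between two 1-D empirical distributions
    (uniform weights, possibly different sizes), computed by integrating
    |F_u^{-1}(y) - F_v^{-1}(y)|^q over y in (0, 1)."""
    u = np.sort(np.asarray(u, dtype=float))
    v = np.sort(np.asarray(v, dtype=float))
    n, m = len(u), len(v)

    # Both quantile functions are constant between consecutive breakpoints
    # of the merged grid {i/n} U {j/m}.
    grid = np.unique(np.concatenate([np.arange(n + 1) / n,
                                     np.arange(m + 1) / m]))
    widths = np.diff(grid)
    mids = (grid[:-1] + grid[1:]) / 2

    # Empirical quantile at level y is the sorted value with index floor(y * n).
    qu = u[np.minimum((mids * n).astype(int), n - 1)]
    qv = v[np.minimum((mids * m).astype(int), m - 1)]

    return float(np.sum(widths * np.abs(qu - qv) ** q) ** (1.0 / q))


if __name__ == "__main__":
    # Hypothetical utilities for two groups of different sizes.
    print(wasserstein_q([0.0, 2.0], [1.0], q=1))
    print(wasserstein_q([0.0, 1.0, 3.0], [5.0, 6.0, 8.0], q=2))
```

Merging the two breakpoint grids keeps the integral exact when the groups differ in size, which a simple sort-and-pair approach would not handle.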

Figures (4)

  • Figure 1: Gap between AM and the two lower bounds for a large population size. The mean and standard deviation of the gap over 10 replications, as well as the average running time of each method, are illustrated.
  • Figure 2: Fairness vs MSE for Fair Regression. Wasserstein fairness versus MSE is shown in (a), (c), (e), (g), and Kolmogorov–Smirnov fairness versus MSE is shown in (b), (d), (f), (h). All training and testing results are averaged over 10 replications.
  • Figure 3: Histograms of Utility for Fair Allocation of COVID-19 Vaccine in GA
  • Figure 4: Histograms of Utility for Fair Knapsack

Theorems & Definitions (47)

  • Lemma 1: Proposition 2.17 in santambrogio2015optimal
  • Definition 1
  • Proposition 1
  • Theorem 1
  • Definition 2
  • Proposition 2
  • Definition 3: Kolmogorov–Smirnov Fairness Measure, agarwal2019fair
  • Definition 4: Breaking Points
  • Proposition 3
  • Definition 5: Theorem 4.1 in lubin2022mixed
  • ...and 37 more