Demographic parity in regression and classification within the unawareness framework

Vincent Divol; Solenne Gaucher

TL;DR

The paper studies the connection between optimal fair cost-sensitive classification and optimal fair regression, and shows that nestedness of the classifiers' decision sets is both necessary and sufficient for a form of equivalence between the two.

Abstract

This paper explores the theoretical foundations of fair regression under the constraint of demographic parity within the unawareness framework, where disparate treatment is prohibited, extending existing results where such treatment is permitted. Specifically, we aim to characterize the optimal fair regression function when minimizing the quadratic loss. Our results reveal that this function is given by the solution to a barycenter problem with optimal transport costs. Additionally, we study the connection between optimal fair cost-sensitive classification and optimal fair regression. We demonstrate that nestedness of the decision sets of the classifiers is both necessary and sufficient to establish a form of equivalence between classification and regression. Under this nestedness assumption, the optimal classifiers can be derived by applying thresholds to the optimal fair regression function; conversely, the optimal fair regression function is characterized by the family of cost-sensitive classifiers.

Paper Structure (29 sections, 24 theorems, 129 equations, 2 figures)

Introduction
Motivation
Problem statement
Notation
Related work
Fair classification
Fair regression
Outline and contribution
A short introduction to optimal transport
The optimal transport problem
Multi-to-one dimensional optimal transport
Fair regression and the barycenter problem
Fair regression functions do not preserve order
Reduction to an optimal transport problem
Reformulation of the demographic parity constraint
...and 14 more sections

Key Result

Theorem 1

Assume that for all $s \in \mathcal{S}$, the distribution $\nu_{s}$ of $\eta(X,S)$ for $S=s$ has no atoms, and let $p_s = \mathbb{P}(S = s)$. Then, $$\min_{f \text{ fair}} \mathbb{E}\left[(\eta(X,S) - f(X,S))^2\right] = \min_{\nu} \sum_{s \in \mathcal{S}} p_s \mathcal{W}_2^2(\nu_s, \nu),$$ where $\mathcal{W}_2^2(\nu_s, \nu)$ is the squared Wasserstein distance between $\nu_s$ and $\nu$. Moreover, if $f^*$ and $\nu$ solve the left-hand side and the right-hand side problems respectively, then $\nu$ is equal to the distribution of $f^*(X,S)$ ...
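
Theorem 1 relates the risk of the best fair predictor to a Wasserstein barycenter problem. The following is a minimal empirical sketch of that relation, not the authors' code. It assumes one-dimensional samples of $\eta$, equal sample sizes in every group, and no ties within a group, and it uses the standard fact that in one dimension the quantile function of the $\mathcal{W}_2$ barycenter is the $p_s$-weighted average of the group quantile functions. The function names `fair_barycenter_predictions` and `w2_squared_1d` are hypothetical and chosen for illustration only.

```python
import numpy as np


def fair_barycenter_predictions(eta_by_group, weights):
    """Map each group's eta samples onto the empirical W2 barycenter.

    eta_by_group: list of 1D arrays of equal length (samples of eta given S = s).
    weights: group probabilities p_s, assumed to sum to 1.
    Returns (fair_by_group, barycenter_quantiles).
    Assumption: equal sample sizes and no ties within a group.
    """
    weights = np.asarray(weights, dtype=float)
    sorted_vals = np.stack(
        [np.sort(np.asarray(e, dtype=float)) for e in eta_by_group]
    )
    # In 1D, the barycenter's quantile function is the weighted
    # average of the group quantile functions (sorted samples here).
    bary = weights @ sorted_vals
    fair = []
    for e in eta_by_group:
        # Rank of each sample within its own group (0 = smallest).
        ranks = np.argsort(np.argsort(e))
        # Send the i-th smallest value of the group to the i-th
        # smallest barycenter value (monotone transport map).
        fair.append(bary[ranks])
    return fair, bary


def w2_squared_1d(a, b):
    """Squared W2 distance between two equal-size empirical measures on R."""
    return float(np.mean((np.sort(a) - np.sort(b)) ** 2))


# Illustrative data: two groups with different eta distributions.
rng = np.random.default_rng(0)
groups = [rng.normal(0.0, 1.0, 200), rng.normal(2.0, 0.5, 200)]
p = [0.3, 0.7]
fair, bary = fair_barycenter_predictions(groups, p)

# Risk of the fair predictor versus the barycenter objective.
risk = sum(w * np.mean((e - f) ** 2) for w, e, f in zip(p, groups, fair))
objective = sum(w * w2_squared_1d(e, bary) for w, e in zip(p, groups))
print(risk, objective)
```

In this sketch every group receives exactly the same multiset of predictions, so demographic parity holds on the sample, and the two printed quantities coincide up to floating-point error. Note that the predictor depends on the group label through the within-group ranks, which matches the setting of Theorem 1 (where $f$ takes both $X$ and $S$ as inputs) rather than the unawareness framework studied in the rest of the paper.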

Figures (2)

Figure 1: The measure $\boldsymbol{\mu_+}$ is displayed in red and the measure $\boldsymbol{\mu_-}$ is displayed in blue. By definition of $\kappa^+(y)$ and $\kappa^-(y)$, the red region and the blue region to the right of the two dotted lines have equal masses. The region in between the two lines contains no mass.
Figure 2: Left: example of a nested problem. The distributions of $\boldsymbol{\mu_+}$ and $\boldsymbol{\mu_-}$ are depicted in red and blue, corresponding to the distributions given in Example \ref{ex:1}. The acceptance regions for $g_y^{\kappa(y)}$ and $g_{y'}^{\kappa'(y)}$ are such that the masses of $\boldsymbol{\mu_+}$ and $\boldsymbol{\mu_-}$ to the right of the decision boundaries are equal. One can observe that these two regions are nested. Right: example of a non-nested problem. The distributions $\boldsymbol{\mu_+}$ and $\boldsymbol{\mu_-}$ are the ones described in Example \ref{ex:2}. The region in pink is rejected for $y=-3$ but accepted for $y=0$, contradicting the nestedness assumption.

Theorems & Definitions (46)

Definition 1: Demographic parity
Definition 2: Fair regression
Definition 3: Fair classification
Theorem 1: Chzhen et al. (2020); Le Gouic et al. (2020)
Theorem 2: Informal
Theorem 3: Informal
Definition 4: Order preservation in regression - unawareness framework
Proposition 1
proof
Lemma 1
...and 36 more
