Regression under demographic parity constraints via unlabeled post-processing

Evgenii Chzhen; Mohamed Hebiri; Gayane Taturyan

TL;DR

A general-purpose post-processing algorithm that, given accurate estimates of the regression function and a sensitive attribute predictor, generates predictions that meet the demographic parity constraint. The method relies on discretization and stochastic minimization of a smooth convex function.

Abstract

We address the problem of performing regression while ensuring demographic parity, even without access to sensitive attributes during inference. We present a general-purpose post-processing algorithm that, using accurate estimates of the regression function and a sensitive attribute predictor, generates predictions that meet the demographic parity constraint. Our method involves discretization and stochastic minimization of a smooth convex function. It is suitable for online post-processing and for multi-class classification tasks, and the post-processing step requires only unlabeled data. Unlike prior methods, our approach is fully theory-driven. Because we require precise control over the gradient norm of the convex function, we rely on more advanced techniques than standard stochastic gradient descent. Our algorithm is backed by finite-sample analysis and post-processing bounds, and experimental results validate our theoretical findings.
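As a rough illustration of what the demographic parity constraint asks for in regression, the sketch below measures how far the group-wise distributions of predictions are from each other, using the largest Kolmogorov-Smirnov gap between empirical CDFs. The function name `dp_unfairness` and the choice of this particular gap are assumptions made for illustration; the paper's own unfairness measure and its post-processing algorithm are not reproduced here.

```python
from bisect import bisect_right


def dp_unfairness(preds, groups):
    """Largest gap between group-wise empirical CDFs of the predictions.

    Hypothetical helper: returns 0.0 when every group has the same
    empirical prediction distribution (exact demographic parity on the
    sample) and 1.0 when two groups are completely separated.
    """
    # Collect and sort the predictions of each sensitive group.
    by_group = {}
    for p, g in zip(preds, groups):
        by_group.setdefault(g, []).append(p)
    for vals in by_group.values():
        vals.sort()

    # Empirical CDFs only jump at observed values, so it is enough
    # to compare them on the set of distinct predictions.
    worst = 0.0
    for t in sorted(set(preds)):
        cdf_at_t = [bisect_right(v, t) / len(v) for v in by_group.values()]
        worst = max(worst, max(cdf_at_t) - min(cdf_at_t))
    return worst
```

A post-processed predictor that satisfies demographic parity should drive this quantity close to zero on held-out data, while the base regressor typically does not.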

Paper Structure (36 sections, 22 theorems, 127 equations, 5 figures, 1 table, 4 algorithms)

Introduction
Contributions
Organization.
Notation.
Problem setup
Our methodology
Introducing discretization.
Properties of $F$ and $\pi_{{\mathbf{\Lambda}}^\star,{\mathbf{V}}^\star}$.
Gradient of $F$ is crucial.
Summary of our approach and why it differs from others.
Proposed algorithm
Theoretical guarantees.
Extension to unknown $\eta$ and $\tau$.
Numerical illustration
Comparison with Agarwal et al. (2019).
...and 21 more sections

Key Result

Lemma 3.1

Let $L \in \mathbb{N}$ and $\beta > 0$. Let $\mathbf{\Lambda}^\star = (\lambda^\star_{\ell s})_{\ell \in [\![ L ]\!],\, s \in [K]}$ and $\mathbf{V}^\star = (\nu^\star_{\ell s})_{\ell \in [\![ L ]\!],\, s \in [K]}$ be two matrices that are solutions to […], where $\boldsymbol{t}(\boldsymbol{x}) \stackrel{\text{def}}{=} 1 - \frac{\boldsymbol{\tau}(\boldsymbol{x})}{\boldsymbol{p}}$, …

Figures (5)

Figure 1: Risk and unfairness of our estimator on Communities and Crime and Law School datasets.
Figure 2: Comparison with ADW model on Communities and Crime and Law School datasets.
Figure 3: Comparison of SGD, ACSA, ACSA2, SGD3+ACSA and SGD3+ACSA2 algorithms on Communities and Crime and Law School datasets.
Figure 4: Experiment on Adult dataset: risk convergence, unfairness convergence and comparison with ADW.
Figure 5: The distributions of the (scaled) predictions of the fair and base models.

Theorems & Definitions (45)

Remark 2.1
Remark 2.2
Remark 3.1: On abuse of notation
Lemma 3.1
Lemma 3.2: Fairness quantification
Lemma 3.3: Risk gain
Lemma 3.4: Regularity of $F$
Lemma 3.5
Remark 3.2: On the dynamic of algorithm
Lemma 4.1
...and 35 more
