Fair Supervised Learning with A Simple Random Sampler of Sensitive Attributes

Jinwon Sohn, Qifan Song, Guang Lin

TL;DR

This work proposes fairness penalties learned by neural networks with a simple random sampler of sensitive attributes for non-discriminatory supervised learning and builds a computationally efficient group-level in-processing fairness-aware training framework.

Abstract

As data-driven decision processes come to dominate industrial applications, fairness-aware machine learning has attracted great attention in various areas. This work proposes fairness penalties learned by neural networks with a simple random sampler of sensitive attributes for non-discriminatory supervised learning. In contrast to many existing works that critically rely on the discreteness of sensitive attributes and response variables, the proposed penalty can handle versatile formats of the sensitive attributes, so it is more widely applicable in practice than many existing algorithms. This penalty enables us to build a computationally efficient group-level in-processing fairness-aware training framework. Empirical evidence shows that our framework achieves better utility and fairness measures on popular benchmark data sets than competing methods. We also theoretically characterize the estimation errors and loss of utility of the proposed neural-penalized risk minimization problem.

Paper Structure

This paper contains 32 sections, 4 theorems, 47 equations, 11 figures, 12 tables, 3 algorithms.

Key Result

Proposition 1

Let $p_{A}$, $p_{h(X)}$, $p_{h(X)|A}$, and $p_{h(X),A}$ be the marginal density of $A$, the marginal density of $h(X)$, the conditional density of $h(X)$ given $A$, and the joint density of $h(X)$ and $A$, respectively. Denote $D^* = \arg\max_{D} R_F(h;D)$. Then, for all $s\in{\cal S}$ and $a \in {\cal A}$,

Figures (11)

  • Figure 1: (Scenario I) Pareto frontiers: the first row shows pairs of SP and AUC, and the second row shows pairs of KS-GSP and AUC from 5 experiments for each $\lambda$. SBP (ours) tends to cluster more tightly in the upper-left corner than the competitors.
  • Figure 2: (Scenario I) Pareto frontiers: the first row shows pairs of EO and AUC, and the second row shows pairs of KS-GEO and AUC from 5 experiments for each $\lambda$. Since SBP and CON are similar, they are compared directly in the figure below. SBP and CON show better results than the others on Adult and Law School Admission, as their points cluster tightly in the upper-left corner.
  • Figure 3: (Scenario II) Pareto frontiers: the first and the second row correspond to Adult, Credit Card Default, and ACSEmployment respectively from 5 experiments for each $\lambda$. SBP is superior to both NEU and HGR overall but comparable to HGR in Adult. Note CON cannot handle statistical parity.
  • Figure 4: (Scenario II) Pareto frontiers: the first and the second row correspond to Adult, Credit Card Default, and ACSEmployment respectively from 5 experiments for each $\lambda$. Remarkably, SBP (ours) outperforms the competitors for the most part. Note KDE cannot handle continuous attributes.
  • Figure 5: (Scenario I of SBP) Pareto frontiers of SP (and KS-GSP) and AUC as $\lambda$ varies from 0.1 to 0.9, across all 5 experiments.
  • ...and 6 more figures
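
The captions above plot fairness measures (SP, KS-GSP) against AUC, but this excerpt does not define them. The sketch below is only an illustration under stated assumptions: SP is taken to be the usual absolute gap in positive-prediction rates between two groups of a binary sensitive attribute, and the KS-type measure is assumed to be a two-sample Kolmogorov-Smirnov distance between the group-wise distributions of model scores. The function names and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def statistical_parity_gap(scores, groups, threshold=0.5):
    """Absolute gap in positive-prediction rates between groups 0 and 1.

    Assumption: a score >= threshold counts as a positive prediction.
    """
    positives = {0: 0, 1: 0}
    counts = {0: 0, 1: 0}
    for score, group in zip(scores, groups):
        counts[group] += 1
        positives[group] += int(score >= threshold)
    return abs(positives[1] / counts[1] - positives[0] / counts[0])


def ks_distance(scores, groups):
    """Two-sample Kolmogorov-Smirnov statistic between group-wise scores.

    Assumption: this stands in for a KS-type parity measure; the paper's
    exact definition of KS-GSP is not reproduced in this excerpt.
    """
    s0 = sorted(s for s, g in zip(scores, groups) if g == 0)
    s1 = sorted(s for s, g in zip(scores, groups) if g == 1)
    # The largest gap between the two empirical CDFs occurs at an observed score.
    return max(
        abs(bisect_right(s0, t) / len(s0) - bisect_right(s1, t) / len(s1))
        for t in s0 + s1
    )


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.9, 0.9]
    groups = [0, 0, 1, 1]
    print(statistical_parity_gap(scores, groups))
    print(ks_distance(scores, groups))
```

Under these assumptions, a point in the upper-left corner of a Pareto plot corresponds to a small value of such a fairness gap together with a high AUC.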

Theorems & Definitions (8)

  • Definition 1: Generalized Statistical Parity (GSP)
  • Proposition 1
  • Definition 2: Generalized Equalized Odds (GEO)
  • Proposition 2
  • Remark 1
  • Definition 3: Rademacher Complexity
  • Theorem 1: Estimation Error
  • Corollary 1: Loss of Utility
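
For orientation, the standard textbook form of the empirical Rademacher complexity is recalled here. The paper's own Definition 3 is not reproduced in this excerpt and may differ in notation or normalization. The symbols ${\cal F}$ (a function class), $z_1,\dots,z_n$ (a sample), and $\sigma_i$ (independent signs, uniform on $\{-1,+1\}$) are assumptions introduced only for this illustration.

$$
\widehat{\mathfrak{R}}_n({\cal F}) = \mathbb{E}_{\sigma}\left[\sup_{f\in{\cal F}} \frac{1}{n}\sum_{i=1}^{n}\sigma_i f(z_i)\right]
$$

Quantities of this type typically appear in estimation-error bounds, which is presumably the role the definition plays ahead of Theorem 1.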