ProxiMix: Enhancing Fairness with Proximity Samples in Subgroups

Jingyu Hu; Jun Hong; Mengnan Du; Weiru Liu

TL;DR

This paper proposes ProxiMix, which keeps both pairwise and proximity relationships for fairer data augmentation, and shows its effectiveness from the perspectives of both fairness of predictions and fairness of recourse.

Abstract

Many bias mitigation methods have been developed to address fairness issues in machine learning. We found that using linear mixup, a data augmentation technique, on its own for bias mitigation can still retain biases present in dataset labels. This paper addresses the issue with a novel pre-processing strategy in which an existing mixup method and our new bias mitigation algorithm are used together to improve the generation of labels for augmented samples, making those labels proximity-aware. Specifically, we propose ProxiMix, which keeps both pairwise and proximity relationships for fairer data augmentation. We conducted thorough experiments with three datasets, three ML models, and different hyperparameter settings. Our experimental results show the effectiveness of ProxiMix from the perspectives of both fairness of predictions and fairness of recourse.
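To make the distinction between pairwise and proximity-aware labels concrete, the following is a minimal sketch and not the authors' implementation. `linear_mixup` is the standard mixup interpolation with mixing ratio `lam`. The helpers `proximity_label` and `blended_label`, the choice of `k` nearest neighbours of the mixed point, and the convention that a balancing degree `d = 1` keeps only the pairwise label while `d = 0` keeps only the proximity label are all assumptions made for illustration; the paper's exact definitions may differ.

```python
import numpy as np

def linear_mixup(x0, y0, x1, y1, lam=0.5):
    """Standard linear mixup: interpolate features and labels of one pair."""
    x_mix = lam * x0 + (1.0 - lam) * x1
    y_mix = lam * y0 + (1.0 - lam) * y1
    return x_mix, y_mix

def proximity_label(x_mix, X, y, k=5):
    """Hypothetical helper: mean label of the k samples nearest to x_mix."""
    dists = np.linalg.norm(X - x_mix, axis=1)   # Euclidean distance to every sample
    nearest = np.argsort(dists)[:k]             # indices of the k closest samples
    return float(y[nearest].mean())

def blended_label(y_pair, y_prox, d=0.5):
    """Assumed blend: d=1 keeps the pairwise label, d=0 keeps the proximity label."""
    return d * y_pair + (1.0 - d) * y_prox

# Toy data: two samples to mix, plus a small pool used for the proximity label.
x0, y0 = np.array([0.0, 0.0]), 0.0
x1, y1 = np.array([2.0, 2.0]), 1.0
X_pool = np.array([[1.0, 1.0], [1.1, 1.0], [5.0, 5.0]])
y_pool = np.array([1.0, 1.0, 0.0])

x_mix, y_pair = linear_mixup(x0, y0, x1, y1, lam=0.5)
y_prox = proximity_label(x_mix, X_pool, y_pool, k=2)
y_final = blended_label(y_pair, y_prox, d=0.5)
```

In this toy setup the pairwise label and the neighbourhood label disagree, so the blended label moves away from the purely pairwise value as `d` decreases.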

Paper Structure (18 sections, 7 equations, 5 figures, 11 tables, 1 algorithm)

Introduction
Related Work
Preliminaries and Problem Statement
Methodology and Experiment Design
ProxiMix Algorithm
Experiment Setting
Results
Sampling Mode Preferences in ProxiMix with Fixed Balancing Degree
The Impact of Balancing Degree in ProxiMix
Counterfactual Cost across Different Groups
Conclusion
Appendices: Dataset Description
Adult Income Dataset
Law School Dataset
Credit Default Dataset
...and 3 more sections

Figures (5)

Figure 1: A comparison between proximity-based mixup and linear mixup. The red circle represents $S_0$, the blue triangle represents $S_1$, the purple diamonds represent the proximity set $D_p$, and the black square indicates the samples after mixing up. Here, we consider case three, the particular case where the labels of most proximity samples are opposite to that of $S_1$. The mixing ratio is set to $0.5$.
Figure 2: An Example of ProxiMix with Balancing $d=[1,0.8,0.5,0.2,0], \lambda=0.5$
Figure 3: The Experiment Workflow
Figure 4: Changes in fairness performance under different balancing degrees $d$ on the Credit Default dataset with the MLP model (fTPR: TPR in the female group; mTPR: TPR in the male group; $d = [0, 0.2, 0.5, 0.7, 0.8, 1]$; see the appendix for detailed results)
Figure 5: Changes in fairness performance under different balancing degrees $d$ on the Adult dataset with the MLP model (fTPR: TPR in the female group; mTPR: TPR in the male group; $d = [0, 0.2, 0.5, 0.7, 0.8, 1]$; see the appendix for detailed results)
