ProxiMix: Enhancing Fairness with Proximity Samples in Subgroups

Jingyu Hu, Jun Hong, Mengnan Du, Weiru Liu

TL;DR

This paper proposes ProxiMix, a data augmentation method that keeps both pairwise and proximity relationships for fairer augmentation, and demonstrates its effectiveness in terms of both fairness of predictions and fairness of recourse.

Abstract

Many bias mitigation methods have been developed to address fairness issues in machine learning. We found that using linear mixup, a data augmentation technique, on its own for bias mitigation can still retain biases present in dataset labels. This paper addresses the issue with a novel pre-processing strategy that combines an existing mixup method with our new bias mitigation algorithm to generate proximity-aware labels for augmented samples. Specifically, we propose ProxiMix, which keeps both pairwise and proximity relationships for fairer data augmentation. We conducted thorough experiments with three datasets, three ML models, and different hyperparameter settings. Our experimental results show the effectiveness of ProxiMix from both the fairness-of-predictions and fairness-of-recourse perspectives.
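
To make the point about label bias concrete, the following is a minimal sketch of standard linear mixup applied to one pair of samples. The function name, argument names, and toy values are illustrative assumptions and do not come from the paper; the sketch covers only the pairwise interpolation and does not reproduce the proximity-aware label generation that ProxiMix adds.

```python
from typing import List, Tuple


def linear_mixup(
    x0: List[float], y0: float,
    x1: List[float], y1: float,
    lam: float = 0.5,
) -> Tuple[List[float], float]:
    """Interpolate one pair of samples (hypothetical helper, not the paper's code).

    Features and labels are mixed with the same ratio `lam`, so the new
    label depends only on the two original labels.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(x0) != len(x1):
        raise ValueError("feature vectors must have the same length")
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x0, x1)]
    y_mix = lam * y0 + (1.0 - lam) * y1
    return x_mix, y_mix


# Toy pair: one sample from each subgroup, mixing ratio 0.5.
x_mix, y_mix = linear_mixup([1.0, 2.0], 0.0, [3.0, 6.0], 1.0, lam=0.5)
print(x_mix, y_mix)
```

Because the mixed label is a weighted average of the two original labels, any bias in those labels carries over directly to the augmented sample, which is the limitation the abstract describes.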

Paper Structure (18 sections, 7 equations, 5 figures, 11 tables, 1 algorithm)

Figures (5)

  • Figure 1: A comparison between proximity-based mixup and linear mixup. The red circle represents $S_0$, the blue triangle represents $S_1$, the purple diamonds represent the proximity set $D_p$, and the black square marks the samples produced by mixup. Here, we consider the particular situation of case three, where the labels of most proximity samples are opposite to that of $S_1$. The mixing ratio is set to $0.5$.
  • Figure 2: An Example of ProxiMix with Balancing $d=[1,0.8,0.5,0.2,0], \lambda=0.5$
  • Figure 3: The Experiment Workflow
  • Figure 4: Changes in fairness performance under different balancing degrees $d$ on the Credit Default dataset with the MLP model (fTPR: TPR in the female group, mTPR: TPR in the male group, $d = [0, 0.2, 0.5, 0.7, 0.8, 1]$; refer to the Credit Default appendix for detailed results)
  • Figure 5: Changes in fairness performance under different balancing degrees $d$ on the Adult dataset with the MLP model (fTPR: TPR in the female group, mTPR: TPR in the male group, $d = [0, 0.2, 0.5, 0.7, 0.8, 1]$; refer to the Adult appendix for detailed results)
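
The captions for Figures 4 and 5 report the true positive rate (TPR) separately for the female and male groups. As a rough illustration of how such a per-group TPR could be computed, here is a minimal sketch; the function name, group codes, and toy labels are assumptions made for this example and are not taken from the paper's experiments.

```python
from typing import List, Sequence


def group_tpr(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    groups: Sequence[str],
    group: str,
) -> float:
    """True positive rate restricted to one group (hypothetical helper).

    TPR = TP / (TP + FN), counted only over samples whose group matches.
    Returns NaN when the group has no positive samples.
    """
    tp = 0
    fn = 0
    for yt, yp, g in zip(y_true, y_pred, groups):
        if g == group and yt == 1:
            if yp == 1:
                tp += 1
            else:
                fn += 1
    if tp + fn == 0:
        return float("nan")
    return tp / (tp + fn)


# Toy data (made up): three female ("f") and three male ("m") samples.
y_true: List[int] = [1, 1, 0, 1, 1, 0]
y_pred: List[int] = [1, 0, 0, 1, 1, 1]
groups: List[str] = ["f", "f", "f", "m", "m", "m"]

f_tpr = group_tpr(y_true, y_pred, groups, "f")
m_tpr = group_tpr(y_true, y_pred, groups, "m")
tpr_gap = abs(f_tpr - m_tpr)
print(f_tpr, m_tpr, tpr_gap)
```

A smaller gap between the two rates indicates that the model identifies positive cases at more similar rates across the two groups.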