Dancing in the Shadows: Harnessing Ambiguity for Fairer Classifiers

Ainhize Barrainkua, Paula Gordaliza, Jose A. Lozano, Novi Quadrianto

TL;DR

This paper proposes training a conventional machine learning classifier on instances whose identity with regard to the sensitive attribute is uncertain, highlighting the potential of prioritizing ambiguity as a means to improve fairness guarantees in real-world classification tasks.

Abstract

This paper introduces a novel approach to bolster algorithmic fairness in scenarios where sensitive information is only partially known. In particular, we propose training a conventional machine learning classifier on instances whose identity with regard to the sensitive attribute is uncertain. The enhanced fairness observed in the final predictions of this classifier highlights the potential of prioritizing ambiguity (i.e., non-normativity) as a means to improve fairness guarantees in real-world classification tasks.
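The selection step implied by the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_ambiguous`, the uncertainty score `1 - max(p, 1 - p)`, and the rule of keeping instances at or above the threshold are all assumptions made for illustration. The input `probs` stands for hypothetical estimates of P(s = 1 | x) produced by some attribute classifier fitted on the instances whose sensitive attribute is known.

```python
def select_ambiguous(probs, threshold):
    """Return indices of instances whose sensitive attribute looks uncertain.

    probs[i] is a hypothetical estimate of P(s = 1 | x_i).
    Uncertainty is scored as 1 - max(p, 1 - p), which lies in [0, 0.5]
    (an assumed measure, not one taken from the paper).
    """
    selected = []
    for i, p in enumerate(probs):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range at index {i}: {p}")
        uncertainty = 1.0 - max(p, 1.0 - p)
        if uncertainty >= threshold:
            selected.append(i)
    return selected


# Illustrative values only: five instances with made-up attribute probabilities.
probs = [0.02, 0.48, 0.91, 0.55, 0.30]
ambiguous_idx = select_ambiguous(probs, threshold=0.25)
```

Under these assumptions, the rows indexed by `ambiguous_idx` would form the training set passed to the downstream classifier, and sweeping `threshold` would trace out curves of the kind reported in the figures.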

Paper Structure (15 sections, 2 equations, 3 figures, 2 tables)

This paper contains 15 sections, 2 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Average results on the German Credit dataset for varying uncertainty threshold values, using different base learners. The first two rows depict fairness guarantees in terms of equal opportunity (EOp) and demographic parity (DP), respectively, with the shaded region indicating variance. The last row shows the results in the joint space of fairness (EOp) and accuracy. Our method outperforms the state-of-the-art (SOTA) interventions in terms of fairness across all ML classifiers. Specifically, SVM and LGBM offer the best fairness guarantees, almost reaching perfect fairness, with minimal compromise in accuracy. Moreover, SVM shows the best fairness-accuracy trade-off curve.
  • Figure 2: Average results on the Adult Income dataset for varying uncertainty threshold values, using different base learners. The first two rows depict fairness guarantees in terms of EOp and DP, respectively, with the shaded region indicating variance. The last row shows the results in the joint space of fairness (EOp) and accuracy. Our method outperforms the SOTA pre- and post-processing interventions in terms of fairness guarantees across all considered ML classifiers. Specifically, SVM and LGBM yield the most favorable fairness guarantees, with a similar decrease in accuracy.
  • Figure 3: Average results on the COMPAS dataset for varying uncertainty threshold values, using different base learners. The first two rows depict fairness guarantees in terms of EOp and DP, respectively, with the shaded region indicating variance. The last row shows the results in the joint space of fairness (EOp) and accuracy. Our approach, with LR and LGBM, outperforms the pre-processing approach in terms of fairness. Further, SVM outperforms both SOTA methods in terms of fairness, providing the best fairness guarantees for this classification task. However, this gain in fairness is typically accompanied by a decrease in accuracy. Notably, the reduction in accuracy is least pronounced with LR (a more vertical trade-off curve).
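The two fairness measures named in the captions can be made concrete with a short sketch. It uses the common textbook definitions for a binary sensitive attribute, which are assumed here rather than taken from the paper's own equations: the DP gap is the absolute difference in positive prediction rates between the two groups, and the EOp gap is the same difference restricted to instances whose true label is positive. The function names and the toy data are illustrative.

```python
def positive_rate(y_pred, mask):
    """Fraction of positive predictions among the instances where mask is True."""
    selected = [p for p, m in zip(y_pred, mask) if m]
    if not selected:
        raise ValueError("no instances selected for this group")
    return sum(selected) / len(selected)


def demographic_parity_gap(y_pred, s):
    """|P(y_hat = 1 | s = 0) - P(y_hat = 1 | s = 1)| for a binary attribute s."""
    rate_0 = positive_rate(y_pred, [a == 0 for a in s])
    rate_1 = positive_rate(y_pred, [a == 1 for a in s])
    return abs(rate_0 - rate_1)


def equal_opportunity_gap(y_true, y_pred, s):
    """Gap in true positive rates: the DP gap restricted to y_true == 1."""
    rate_0 = positive_rate(y_pred, [a == 0 and y == 1 for a, y in zip(s, y_true)])
    rate_1 = positive_rate(y_pred, [a == 1 and y == 1 for a, y in zip(s, y_true)])
    return abs(rate_0 - rate_1)


# Toy data (made up): eight instances, four per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
s      = [0, 0, 0, 0, 1, 1, 1, 1]

dp_gap = demographic_parity_gap(y_pred, s)          # 0.0
eop_gap = equal_opportunity_gap(y_true, y_pred, s)  # 0.5
```

In this toy case both groups receive positive predictions at the same rate, so the DP gap is zero, while the true positive rates differ (0.5 versus 1.0), so the EOp gap is 0.5. The two criteria can therefore disagree on the same predictions, which is one reason to report them in separate rows.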