BiasGuard: Guardrailing Fairness in Machine Learning Production Systems

Nurit Cohen-Inger, Seffi Cohen, Neomi Rabaev, Lior Rokach, Bracha Shapira

TL;DR

BiasGuard tackles fairness in production ML by introducing a post-processing guardrail that uses Test-Time Augmentation with CTGAN to generate synthetic samples conditioned on inverse protected attributes. The method detects potential bias at inference by comparing original and opposite-protected predictions and, when needed, augments with $\mathcal{T}$ synthetic samples to balance outcomes, aggregating predictions to reduce disparities. Across five datasets, BiasGuard delivers a substantial $EOD$ improvement (≈31%) with only a minor average accuracy loss (≈0.09%), outperforming Threshold Optimizer and Reject Option in fairness with less detrimental accuracy trade-offs. This approach provides a model-agnostic, deployment-friendly mechanism to safeguard fairness in dynamic production settings without retraining, though it introduces inference-time overhead that is mitigated by configurable augmentation levels and hardware acceleration.

Abstract

As machine learning (ML) systems increasingly impact critical sectors such as hiring, financial risk assessments, and criminal justice, the imperative to ensure fairness has intensified due to potential negative implications. While much ML fairness research has focused on enhancing training data and processes, addressing the outputs of already deployed systems has received less attention. This paper introduces 'BiasGuard', a novel approach designed to act as a fairness guardrail in production ML systems. BiasGuard leverages Test-Time Augmentation (TTA) powered by a Conditional Tabular Generative Adversarial Network (CTGAN), a generative model for tabular data, to synthesize data samples conditioned on inverted protected attribute values, thereby promoting equitable outcomes across diverse groups. This method aims to provide equal opportunities for both privileged and unprivileged groups while significantly enhancing the fairness metrics of deployed systems without the need for retraining. Our comprehensive experimental analysis across diverse datasets reveals that BiasGuard enhances fairness by 31% while only reducing accuracy by 0.09% compared to non-mitigated benchmarks. Additionally, BiasGuard outperforms existing post-processing methods in improving fairness, positioning it as an effective tool to safeguard against biases when retraining the model is impractical.

Paper Structure

This paper contains 29 sections, 3 equations, 3 figures, 2 tables, 1 algorithm.

Figures (3)

  • Figure 1: BiasGuard motivation - For every sample $x^{(i)}$ from the test set, synthetic data is generated with the opposite value of its protected attribute as a condition. Then, the prediction is balanced with the nearest samples.
  • Figure 2: An overview of the BiasGuard method. (1) Each sample $x^{(i)}$ is taken from the test set. (2) CTGAN-generated synthetic samples with the opposite protected value of $x^{(i)}$ are selected as its test-time augmentations (TTA). (3) The black-box model returns predictions for these augmentations. (4) The final prediction $\hat{y}_{final}$ is produced by aggregating the instance prediction with all the augmentation predictions.
  • Figure 3: Tradeoff between fairness (represented by EoD) and accuracy.
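
The flow described in the TL;DR and in Figure 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions that do not come from the paper: the protected attribute is binary (0/1), the "opposite-protected prediction" is obtained by flipping that attribute on the original sample, aggregation is a plain mean of probabilities, and `sample_opposite` is a hypothetical stand-in for a trained conditional generator such as CTGAN. The names `biasguard_predict`, `toy_model`, and `toy_sampler` are illustrative only.

```python
import numpy as np


def biasguard_predict(predict_proba, x, protected_idx, sample_opposite,
                      n_aug=5, threshold=0.5):
    """Guardrail sketch for a single sample with a binary protected attribute.

    predict_proba:   black-box model, (n, d) array -> (n,) positive-class probabilities.
    sample_opposite: HYPOTHETICAL generator stand-in (e.g. a wrapper around CTGAN),
                     called as sample_opposite(x, protected_value, n) -> (n, d) array.
    Returns (label, probability).
    """
    x = np.asarray(x, dtype=float)
    p_orig = float(predict_proba(x[None, :])[0])

    # Assumed detection step: score the same sample with the protected value flipped.
    x_flip = x.copy()
    x_flip[protected_idx] = 1.0 - x[protected_idx]
    p_flip = float(predict_proba(x_flip[None, :])[0])

    # Decisions agree: no potential bias flagged, keep the original prediction.
    if (p_orig >= threshold) == (p_flip >= threshold):
        return int(p_orig >= threshold), p_orig

    # Decisions disagree: draw n_aug synthetic samples with the opposite
    # protected value and average their predictions with the original one.
    augmented = sample_opposite(x, x_flip[protected_idx], n_aug)
    p_aug = predict_proba(augmented)
    p_final = float(np.mean(np.concatenate(([p_orig], p_aug))))
    return int(p_final >= threshold), p_final


# Toy setup (illustrative only): column 0 is the protected attribute,
# and the model deliberately favours protected value 1.
def toy_model(X):
    return np.clip(0.4 + 0.3 * X[:, 0] + 0.1 * X[:, 1], 0.0, 1.0)


_rng = np.random.default_rng(0)


def toy_sampler(x, protected_value, n):
    # Copies x, sets the protected column, and jitters the other feature slightly.
    rows = np.tile(np.asarray(x, dtype=float), (n, 1))
    rows[:, 0] = protected_value
    rows[:, 1] += _rng.normal(0.0, 0.01, size=n)
    return rows


label, prob = biasguard_predict(toy_model, [0.0, 0.5], 0, toy_sampler, n_aug=5)
print(label, round(prob, 3))
```

In this toy case the original sample scores 0.45 and its flipped copy scores 0.75, so the two decisions disagree and the guardrail averages the original score with five augmented scores near 0.75, giving roughly 0.70. The `n_aug` parameter plays the role of the configurable augmentation level mentioned in the TL;DR: larger values shift the final prediction further toward the augmented samples at the cost of more model calls per inference.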