Towards Fair Machine Learning Software: Understanding and Addressing Model Bias Through Counterfactual Thinking

Zichong Wang; Yang Zhou; Israat Haque; David Lo; Wenbin Zhang

TL;DR

This work tackles fairness bugs in ML software by addressing bias at its source: the data. It introduces CFSA, a framework built around counterfactual thinking that derives a Counterfactual Bias List ($CBList$), balances biased representation, corrects labeling bias, and applies fair data synthesis. A separate accuracy-driven model is then trained and combined with the fairness-oriented model through ensemble training. Key contributions include formalizing $CFTest$, $CDTest$, and $CBTest$, plus a data-synthesis approach that preserves class balance. Empirically, CFSA achieves favorable fairness-accuracy trade-offs across eight real-world datasets and ten uni-attribute tasks, outperforming state-of-the-art baselines in 84.6% of overall cases. The approach enables model-agnostic bias mitigation with strong multi-attribute handling and a tunable weighting strategy for deployment contexts, signaling practical impact for fair ML software engineering and AI safety pipelines. Future work will extend CFSA to text and image domains and incorporate industry-scale datasets and evaluation tools.
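The counterfactual idea behind a bias list can be illustrated with a minimal sketch: flip only the sensitive attribute of each record and flag the records whose prediction changes. This is not the paper's $CFTest$, $CDTest$, or $CBTest$, whose formal definitions are not given in this summary. The function name `counterfactual_flip_indices`, the toy model `toy_predict`, and the `sex`/`income` fields are all hypothetical, and the sensitive attribute is assumed to be binary (0/1).

```python
from typing import Callable, Dict, List

Record = Dict[str, int]


def counterfactual_flip_indices(
    records: List[Record],
    predict: Callable[[Record], int],
    sensitive_key: str,
) -> List[int]:
    """Return indices of records whose prediction changes when only the
    binary sensitive attribute is flipped (illustrative assumption)."""
    flagged = []
    for i, rec in enumerate(records):
        flipped = dict(rec)  # copy so the original record is untouched
        flipped[sensitive_key] = 1 - rec[sensitive_key]
        if predict(rec) != predict(flipped):
            flagged.append(i)
    return flagged


def toy_predict(rec: Record) -> int:
    # Hypothetical, deliberately biased model: the sensitive attribute
    # directly raises the score.
    return int(rec["income"] + 2 * rec["sex"] >= 5)


records = [
    {"sex": 1, "income": 3},
    {"sex": 0, "income": 6},
    {"sex": 0, "income": 4},
    {"sex": 1, "income": 1},
]

flagged = counterfactual_flip_indices(records, toy_predict, "sex")
print(flagged)
```

In this toy setup, the flagged records are the ones whose outcome depends on the sensitive attribute alone. A list of that kind is the sort of input a debiasing step could then rebalance or relabel.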

Abstract

The increasing use of Machine Learning (ML) software can lead to unfair and unethical decisions, so fairness bugs in software are a growing concern. Addressing these fairness bugs often involves sacrificing ML performance, such as accuracy. To address this issue, we present a novel approach that uses counterfactual thinking to tackle the root causes of bias in ML software. In addition, our approach combines models optimized for performance and for fairness, resulting in a solution that is optimal in both aspects. We evaluated our approach on 10 benchmark tasks using 5 performance metrics, 3 fairness metrics, and 15 measurement scenarios, all applied to 8 real-world datasets. The evaluations show that the proposed method significantly improves the fairness of ML software while maintaining competitive performance, outperforming state-of-the-art solutions in 84.6% of overall cases based on a recent benchmarking tool.
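The abstract does not name its three fairness metrics, but the caption of Figure 4 refers to statistical parity differences. As a hedged illustration of that quantity, the sketch below computes it as the positive-prediction rate of the unprivileged group minus that of the privileged group. Encoding the unprivileged group as `0` and the privileged group as `1` is an assumption made here, as is the helper name `statistical_parity_difference`; the paper may use a different sign convention or an absolute value.

```python
from typing import Sequence


def statistical_parity_difference(
    y_pred: Sequence[int], sensitive: Sequence[int]
) -> float:
    """P(y_hat = 1 | s = 0) - P(y_hat = 1 | s = 1).

    Assumes binary predictions and a binary sensitive attribute where
    0 marks the unprivileged group and 1 the privileged group.
    """
    if len(y_pred) != len(sensitive):
        raise ValueError("y_pred and sensitive must have the same length")
    unpriv = [y for y, s in zip(y_pred, sensitive) if s == 0]
    priv = [y for y, s in zip(y_pred, sensitive) if s == 1]
    if not unpriv or not priv:
        raise ValueError("both groups must contain at least one sample")
    return sum(unpriv) / len(unpriv) - sum(priv) / len(priv)


# Toy data: 1 of 4 unprivileged and 3 of 4 privileged samples
# receive the favorable prediction.
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
sensitive = [0, 0, 0, 0, 1, 1, 1, 1]

spd = statistical_parity_difference(y_pred, sensitive)
print(spd)
```

A value of 0 means both groups receive the favorable prediction at the same rate. Negative values mean the unprivileged group receives it less often.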

Paper Structure (34 sections, 7 equations, 9 figures, 3 tables)

Introduction
Preliminaries and Background
Terminology
Root of Model Bias
Data Imbalance Bias
Labeling bias
Methodology
Briefly
Counterfactually Debiasing Biased Dataset
Counterfactual Bias List
Balancing Biased Representation
Correcting Labeling Bias
Fair Synthesis
Accuracy-Driven Training
Ensemble Training
...and 19 more sections

Figures (9)

Figure 1: All datasets exhibit an imbalanced distribution concerning the sensitive attribute and class label.
Figure 2: The overall framework of CFSA: (a) Biased dataset; (b) Counterfactual fairness test; (c) Debiased dataset; (d) Fairness-oriented training; (e) Performance-driven training; (f) Ensemble prediction.
Figure 3: Fairea's mitigation regions based on changes in performance and bias.
Figure 4: Statistical parity differences before and after removing the biased samples identified by CBList.
Figure 5: Proportion of cases where CFSA beats the baseline in different ML algorithms.
...and 4 more figures
