
Fairpriori: Improving Biased Subgroup Discovery for Deep Neural Network Fairness

Kacy Zhou, Jiawen Wen, Nan Yang, Dong Yuan, Qinghua Lu, Huaming Chen

TL;DR

Fairpriori tackles intersectional bias by unifying biased subgroup discovery with multiple fairness metrics through an Apriori-based engine. By embedding metric calculations directly into the frequent-itemset generation, it achieves fast, interpretable detection of subgroups that exhibit fairness disparities, outperforming state-of-the-art baselines in efficiency and output richness. The method is demonstrated on COMPAS and Diabetes datasets, highlighting scenarios where different metrics reveal distinct at-risk subgroups and showing substantial runtime gains over existing tools. Open-source tooling and clear usage guidelines support practical adoption for fairness testing and bias mitigation in deep neural networks.

Abstract

While deep learning has become a core functional module of most software systems, concerns about the fairness of ML predictions have emerged as a significant issue, because discrimination can distort prediction results. Intersectional bias, which disproportionately affects members of subgroups defined by a combination of attributes, is a prime example. For instance, a machine learning model might exhibit bias against darker-skinned women while showing no bias against darker-skinned individuals or women when each group is considered on its own. This problem calls for effective fairness testing before such deep learning models are deployed in real-world scenarios. However, research into detecting such bias is currently limited compared to research on individual and group fairness. Existing tools for investigating intersectional bias lack important features such as support for multiple fairness metrics, fast and efficient computation, and user-friendly interpretation. This paper introduces Fairpriori, a novel biased subgroup discovery method that aims to address these limitations. Fairpriori incorporates the frequent itemset generation algorithm to enable effective and efficient investigation of intersectional bias by computing fairness metrics quickly on subgroups of a dataset. In a comparison with state-of-the-art methods (e.g., Themis, FairFictPlay, and TestSGD) under similar conditions, Fairpriori demonstrates superior effectiveness and efficiency in identifying intersectional bias. Specifically, Fairpriori is easier to use and interpret, supports a wider range of use cases by accommodating multiple fairness metrics, and computes fairness metrics more efficiently. These findings showcase Fairpriori's potential for effectively uncovering subgroups affected by intersectional bias, supported by its open-source tooling at https://anonymous.4open.science/r/Fairpriori-0320.
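To make the idea of pairing frequent itemset generation with a fairness metric more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a level-wise (Apriori-style) search over attribute=value combinations, and it uses the gap between a subgroup's false positive rate and the overall false positive rate as one possible reading of a predictive-equality check. All names (`frequent_subgroups`, `fpr_gaps`, `TOY_ROWS`) and the toy data are hypothetical.

```python
from itertools import combinations

def frequent_subgroups(rows, min_support, max_len):
    """Level-wise (Apriori-style) search over attribute=value itemsets.

    rows: list of (attrs: dict, y_true: int, y_pred: int).
    Returns {frozenset of (attr, value) items: list of covered row indices}.
    """
    n = len(rows)
    # Level 1: every single attribute=value item and the rows it covers.
    level = {}
    for i, (attrs, _, _) in enumerate(rows):
        for item in attrs.items():
            level.setdefault(frozenset([item]), []).append(i)
    level = {k: v for k, v in level.items() if len(v) / n >= min_support}
    result = dict(level)
    size = 1
    while level and size < max_len:
        size += 1
        candidates = {}
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) != size or union in candidates:
                continue
            # A subgroup cannot hold two values of the same attribute.
            if len({attr for attr, _ in union}) != size:
                continue
            # Apriori pruning: every (size-1)-subset must itself be frequent.
            if any(frozenset(s) not in level
                   for s in combinations(union, size - 1)):
                continue
            covered = sorted(set(level[a]) & set(level[b]))
            if len(covered) / n >= min_support:
                candidates[union] = covered
        result.update(candidates)
        level = candidates
    return result

def false_positive_rate(rows, idx):
    """FPR over the given row indices; None if there are no true negatives."""
    negatives = [i for i in idx if rows[i][1] == 0]
    if not negatives:
        return None
    return sum(rows[i][2] for i in negatives) / len(negatives)

def fpr_gaps(rows, min_support=0.1, max_len=2):
    """Subgroup FPR minus overall FPR (assumed disparity definition)."""
    overall = false_positive_rate(rows, range(len(rows)))
    if overall is None:
        return {}
    gaps = {}
    for itemset, idx in frequent_subgroups(rows, min_support, max_len).items():
        fpr = false_positive_rate(rows, idx)
        if fpr is not None:
            gaps[itemset] = fpr - overall
    return gaps

# Hypothetical toy data: (attributes, true label, predicted label).
TOY_ROWS = [
    ({"sex": "F", "race": "A"}, 0, 1), ({"sex": "F", "race": "A"}, 0, 1),
    ({"sex": "F", "race": "B"}, 0, 0), ({"sex": "F", "race": "B"}, 0, 0),
    ({"sex": "M", "race": "A"}, 0, 0), ({"sex": "M", "race": "A"}, 0, 0),
    ({"sex": "M", "race": "B"}, 0, 1), ({"sex": "M", "race": "B"}, 0, 1),
]
gaps = fpr_gaps(TOY_ROWS, min_support=0.25, max_len=2)
```

In this toy data, the single-attribute subgroups `sex=F` and `race=A` each have a gap of 0, while their intersection has a gap of 0.5. This mirrors the situation described in the abstract, where a model shows bias against an intersectional subgroup without showing bias against either parent group. The sketch carries row indices through the search so that metrics can be computed per subgroup without rescanning the dataset; how Fairpriori itself embeds metric computation is described in the paper, not here.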

Paper Structure

This paper contains 25 sections, 5 figures, 8 tables, and 1 algorithm.

Figures (5)

  • Figure 1: An Overview of Fairpriori
  • Figure 2: Straightforward flowchart of Fairpriori Application
  • Figure 3: Results comparing the effect of minimum support threshold and maximum subgroup length on time for the predictive equality fairness metric between Fairpriori and FairFictPlay using the Diabetes dataset.
  • Figure 4: Results comparing the effect of minimum support threshold and maximum subgroup length on time for the equalised opportunities fairness metric between Fairpriori and FairFictPlay using the Diabetes dataset.
  • Figure 5: Scatter plot comparing the effect of minimum support threshold and maximum subgroup length on time for the Diabetes dataset between Fairpriori and TestSGD. Results for TestSGD at a support of 10% are not shown as they took at least 45 minutes and would not be plottable.