Locating disparities in machine learning

Moritz von Zahn; Oliver Hinz; Stefan Feuerriegel

TL;DR

This paper tackles the problem of locating disparate ML outcomes without requiring predefined sensitive attributes. It introduces Automatic Location of Disparities (ALD), a three-step framework that uses recursive partitioning to generate candidate subgroups, chi-square-based hypothesis testing to assess subgroup disparities, and audit reports with visualizations to guide investigations. ALD supports arbitrary classifiers and multiple notions of disparity (e.g., statistical parity, equalized odds) and handles both categorical and continuous predictors, including intersectional interactions. The method demonstrates superior performance on synthetic data and aligns with real-world domain knowledge on Adult Income and COMPAS datasets, offering a practical, open-source tool for algorithmic audits and fairness mitigation while highlighting limitations around causality and observed attributes. Overall, ALD provides a principled, scalable approach to identifying and prioritizing disparity-inducing subgroups to support compliant, equitable ML deployment.

Abstract

Machine learning can provide predictions with disparate outcomes, in which subgroups of the population (e.g., defined by age, gender, or other sensitive attributes) are systematically disadvantaged. In order to comply with upcoming legislation, practitioners need to locate such disparate outcomes. However, previous literature typically detects disparities through statistical procedures that require the sensitive attribute to be specified a priori. This limits applicability in real-world settings where datasets are high-dimensional and, on top of that, sensitive attributes may be unknown. As a remedy, we propose a data-driven framework called Automatic Location of Disparities (ALD), which aims at locating disparities in machine learning. ALD meets several demands from industry: ALD (1) is applicable to arbitrary machine learning classifiers; (2) operates on different definitions of disparities (e.g., statistical parity or equalized odds); and (3) deals with both categorical and continuous predictors, even if disparities arise from complex and multi-way interactions known as intersectionality (e.g., age above 60 and female). ALD produces interpretable audit reports as output. We demonstrate the effectiveness of ALD based on both synthetic and real-world datasets. As a result, we empower practitioners to effectively locate and mitigate disparities in machine learning algorithms, conduct algorithmic audits, and protect individuals from discrimination.
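
To make the two disparity notions named in the abstract concrete, the sketch below computes a statistical parity gap and equalized odds gaps for an intersectional subgroup (age above 60 and female). The record layout, the helper names `rate`, `statistical_parity_gap`, and `equalized_odds_gaps`, and the toy data are assumptions for illustration only and are not taken from the ALD code base.

```python
def rate(values):
    """Share of ones in a list of 0/1 values (NaN for an empty list)."""
    values = list(values)
    return sum(values) / len(values) if values else float("nan")


def statistical_parity_gap(y_pred, mask):
    """Positive-prediction rate inside the subgroup minus the rate outside."""
    inside = [p for p, m in zip(y_pred, mask) if m]
    outside = [p for p, m in zip(y_pred, mask) if not m]
    return rate(inside) - rate(outside)


def equalized_odds_gaps(y_true, y_pred, mask):
    """Gaps in true positive rate and false positive rate (inside minus outside)."""
    gaps = {}
    for label, name in ((1, "tpr_gap"), (0, "fpr_gap")):
        inside = [p for t, p, m in zip(y_true, y_pred, mask) if m and t == label]
        outside = [p for t, p, m in zip(y_true, y_pred, mask) if not m and t == label]
        gaps[name] = rate(inside) - rate(outside)
    return gaps


# Hypothetical records: (age, gender, true label, predicted label).
records = [
    (65, "female", 1, 0),
    (70, "female", 1, 0),
    (62, "female", 0, 0),
    (68, "female", 1, 1),
    (45, "female", 1, 1),
    (30, "male", 1, 1),
    (66, "male", 0, 1),
    (50, "male", 0, 0),
]
y_true = [r[2] for r in records]
y_pred = [r[3] for r in records]
# Intersectional subgroup: neither age nor gender alone defines membership.
mask = [age > 60 and gender == "female" for age, gender, _, _ in records]

sp_gap = statistical_parity_gap(y_pred, mask)
eo_gaps = equalized_odds_gaps(y_true, y_pred, mask)
print(sp_gap, eo_gaps)
```

Statistical parity looks only at predictions, whereas equalized odds conditions on the true label, so a subgroup can show a gap under one notion and not the other.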

Paper Structure (26 sections, 12 equations, 5 figures, 2 tables)

Introduction
Related work
Notions of fairness in ML
Metrics for measuring disparities in ML
Tools for locating disparities in ML
Framework for automatic location of disparities
Task description
Overview of ALD framework
Recursive partitioning for subgroup generation (Step 1)
Statistical hypothesis testing for subgroup-specific disparity assessment (Step 2)
Audit report generation (Step 3)
Implementation
Experimental setup
Baselines
Synthetic datasets
...and 11 more sections

Figures (5)

Figure 1: Example of a visualization provided by the audit report in ALD.
Figure 2: Performance is measured with respect to the rate (in %) of how often the attribute causing the disparities is correctly located. Baselines are multivariate subset scan (Zhang 2016) and parameter instability tree (Chouldechova 2017). Performance is averaged across 100 random draws of the synthetic datasets. In the left plot, we vary the induced disparity $\rho$ for a fixed width $w = 24 \text{ years}$ ($\frac{1}{3}$ of the total age span). In the center plot, we fix $\rho = 0.2$ and vary $w$. For synthetic dataset 2 on the right, we vary the induced disparity $\rho$.
Figure 3: Probability and observed frequencies of outcome $y$ for different levels of age in a random draw of synthetic dataset 1 (light gray represents individuals with $y=0$, dark gray with $y=1$).
Figure 4: Probability and observed frequencies of outcome $y$ in synthetic dataset 2 with "hidden" intersectionality, i.e., where disparate outcomes are only induced for the combination of race and gender.
Figure 5: Performance is measured with respect to the rate (in %) of how often the attribute causing the disparities is correctly located. Here, we compare ALD with different tree-based algorithms. Results are based on 100 random draws of synthetic datasets 1 (left, center) and 2 (right). On the left, we vary the induced disparity $\rho$ for a fixed width $w = 24 \text{ years}$ ($\frac{1}{3}$ of the total age span). In the center, we fix $\rho = 0.2$ and vary $w$. For synthetic dataset 2 on the right, we vary the induced disparity $\rho$.
