Counterfactual Situation Testing: Uncovering Discrimination under Fairness given the Difference

Jose M. Alvarez; Salvatore Ruggieri

TL;DR

The paper addresses the challenge of detecting individual discrimination in classifiers, focusing on indirect discrimination under EU law. It introduces Counterfactual Situation Testing (CST), a framework that combines structural causal models with situation testing by constructing a test group around the complainant's counterfactual and a control group around the factual instance. CST provides three core contributions: (1) an actionable discrimination detection workflow, (2) an operationalization of fairness given the difference via mutatis mutandis counterfactuals, and (3) an uncertainty-enabled extension of counterfactual fairness with confidence intervals. Through synthetic and Law School Admissions experiments, CST detects more instances of discrimination than traditional situation testing and counterfactual fairness, and demonstrates practical utility across multiple protected attributes and parameter settings. This approach enhances regulatory relevance and provides a principled means to quantify and reason about discrimination with statistical certainty.

Abstract

We present counterfactual situation testing (CST), a causal data mining framework for detecting discrimination in classifiers. CST aims to answer in an actionable and meaningful way the intuitive question "what would have been the model outcome had the individual, or complainant, been of a different protected status?" It extends the legally-grounded situation testing of Thanh et al. (2011) by operationalizing the notion of fairness given the difference using counterfactual reasoning. For any complainant, we find and compare similar protected and non-protected instances in the dataset used by the classifier to construct a control and test group, where a difference between the decision outcomes of the two groups implies potential individual discrimination. Unlike situation testing, which builds both groups around the complainant, we build the test group on the complainant's counterfactual generated using causal knowledge. The counterfactual is intended to reflect how the protected attribute when changed affects the seemingly neutral attributes used by the classifier, which is taken for granted in many frameworks for discrimination. Under CST, we compare similar individuals within each group but dissimilar individuals across both groups due to the possible difference between the complainant and its counterfactual. Evaluating our framework on two classification scenarios, we show that it uncovers a greater number of cases than situation testing, even when the classifier satisfies the counterfactual fairness condition of Kusner et al. (2017).
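The group construction described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the structural equations, the coefficients `B1`, `B2`, `C`, the helper names `counterfactual`, `nearest`, and `cst_delta`, and the toy data are all assumptions made for illustration. It also simplifies by drawing the test group from the observed non-protected instances, whereas the paper defines its own search spaces and a counterfactual dataset. The sketch generates the complainant's counterfactual by abduction, action, and prediction, builds a control group around the factual instance and a test group around the counterfactual, and compares their negative-decision rates.

```python
import math

# Hypothetical additive-noise SCM (coefficients are assumptions, not from the paper):
#   X1 = B1 * A + U1            (e.g. annual salary)
#   X2 = B2 * A + C * X1 + U2   (e.g. bank balance)
B1, B2, C = -1.0, -0.5, 0.3

def counterfactual(a, x1, x2):
    """Abduction, action, prediction for one individual."""
    u1 = x1 - B1 * a                 # abduction: recover the noise terms
    u2 = x2 - B2 * a - C * x1
    a_cf = 1 - a                     # action: flip the protected attribute
    x1_cf = B1 * a_cf + u1           # prediction: propagate the change downstream
    x2_cf = B2 * a_cf + C * x1_cf + u2
    return a_cf, x1_cf, x2_cf

def nearest(center, rows, candidates, k):
    """Return the k candidate indices whose feature rows are closest to center."""
    return sorted(candidates, key=lambda i: math.dist(center, rows[i]))[:k]

def cst_delta(c, A, X, y_hat, k=3):
    """Negative-decision rate of the control group minus that of the test group."""
    _, x1_cf, x2_cf = counterfactual(A[c], X[c][0], X[c][1])
    same = [i for i in range(len(A)) if A[i] == A[c] and i != c]
    other = [i for i in range(len(A)) if A[i] != A[c]]
    ctr = nearest(X[c], X, same, k)               # around the factual complainant
    tst = nearest((x1_cf, x2_cf), X, other, k)    # around the counterfactual
    p_ctr = sum(y_hat[i] == 0 for i in ctr) / len(ctr)
    p_tst = sum(y_hat[i] == 0 for i in tst) / len(tst)
    return p_ctr - p_tst

# Toy data: A = 1 marks the protected group, y_hat = 0 is a negative decision.
A = [1, 1, 1, 1, 0, 0, 0, 0]
X = [[2.0, 1.0], [2.1, 1.1], [1.9, 0.9], [2.2, 1.0],
     [3.0, 1.8], [3.1, 1.9], [2.9, 1.7], [3.2, 1.8]]
y_hat = [0, 0, 0, 0, 1, 1, 1, 1]

delta = cst_delta(0, A, X, y_hat, k=3)
print(delta)  # a positive value flags potential individual discrimination
```

In this sketch the test group is centered on the counterfactual rather than on the complainant, so the two groups are allowed to differ in salary and balance. That is the mutatis mutandis comparison the abstract refers to as fairness given the difference.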

Paper Structure (19 sections, 1 theorem, 14 equations, 7 figures, 5 tables)

Introduction
Related Work
Causal Knowledge for Discrimination
Structural Causal Models and Counterfactuals
Conceiving Discrimination
Fairness given the Difference: the Kohler-Hausmann Critique
Counterfactual Situation Testing
Building Control and Test Groups
Detecting Discrimination
Connection to Counterfactual Fairness
An Implementation: k-NN CST
Experiments
An Illustrative Example
Law School Admissions
Conclusion
...and 4 more sections

Key Result

Proposition 1

Counterfactual fairness neither implies nor is implied by Individual Discrimination (Definition 4).

Figures (7)

Figure 1: The causal knowledge with corresponding SCM $\mathcal{M}$ and DAG $\mathcal{G}$ behind our (illustrative example) loan application dataset. Let $A$ denote an individual's gender, $X_1$ annual salary, $X_2$ bank balance, and $\widehat{Y}$ the loan decision based on the bank's ADM $b()$.
Figure 2: Left. Account balance ($X_2$) distribution for females in the factual $\mathcal{D}$ and counterfactual $\mathcal{D}^{CF}$ datasets. Right. A comparison on $X_2$ of the ST and CST (w/o) control group (ctr) versus the ST (tst-st) and CST (w/o) (tst-cf) test groups for five randomly chosen complainants detected by both methods, showing the fairness given the difference behind CST as tst-st is closer to ctr than tst-cf.
Figure 3: The causal knowledge with corresponding SCM $\mathcal{M}$ and DAG $\mathcal{G}$ behind the law school admissions dataset, with $R$ denoting race ($R=1$ for non-white) and $G$ denoting gender ($G=1$ for female).
...and 4 more figures

Theorems & Definitions (6)

Definition 1: Search Spaces
Definition 2: Counterfactual Dataset
Definition 3: Search Centers
Definition 4: Individual Discrimination
Definition 5: Confidence on the Individual Discrimination Claim
Proposition 1: On Actionable Counterfactual Fairness
