A New Paradigm for Counterfactual Reasoning in Fairness and Recourse

Lucius E. J. Bynum, Joshua R. Loftus, Julia Stoyanovich

TL;DR

This work introduces a backtracking counterfactual paradigm for fairness and recourse that avoids intervening on legally protected attributes. It defines new notions of counterfactual opportunity and effort, along with corresponding individual- and group-level discrimination criteria, grounded in backtracking conditional distributions $P_B(U^* \mid U)$ and an opportunity set $S$. An algorithm for sampling backtracking counterfactuals is provided and demonstrated on synthetic hiring data and a law school dataset, revealing that traditional balance assumptions and interventional notions may miss important fairness signals. The approach enables explanations of counterfactual outcomes while accommodating socially constructed categories and relaxing modularity assumptions, with practical implications for auditing and improving demographic data usage in AI systems.

Abstract

Counterfactuals and counterfactual reasoning underpin numerous techniques for auditing and understanding artificial intelligence (AI) systems. The traditional paradigm for counterfactual reasoning in this literature is the interventional counterfactual, where hypothetical interventions are imagined and simulated. For this reason, the starting point for causal reasoning about legal protections and demographic data in AI is an imagined intervention on a legally-protected characteristic, such as ethnicity, race, gender, disability, age, etc. We ask, for example, what would have happened had your race been different? An inherent limitation of this paradigm is that some demographic interventions -- like interventions on race -- may not translate into the formalisms of interventional counterfactuals. In this work, we explore a new paradigm based instead on the backtracking counterfactual, where rather than imagine hypothetical interventions on legally-protected characteristics, we imagine alternate initial conditions while holding these characteristics fixed. We ask instead, what would explain a counterfactual outcome for you as you actually are or could be? This alternate framework allows us to address many of the same social concerns, but to do so while asking fundamentally different questions that do not rely on demographic interventions.

Paper Structure (14 sections, 4 equations, 6 figures, 1 algorithm)

Figures (6)

  • Figure 1: A visual example of several backtracking counterfactual quantities to consider for algorithmic fairness and recourse (labeled on the right). Based on Example 1, consider covariate $X \sim \mathcal{N}(0, 1)$, protected attribute $A \sim \text{Bern}(0.5)$, and predictor $\widehat{Y}=A \lor (X > 0)$. A corresponding DAG is shown in a separate figure. With backtracking conditional $P_B(U^* \mid U) = \left(U^*_A = U_A, U^*_X = \mathcal{N}(U_X, 1), U^*_Y = 0\right)$ and an appropriate cost function $\mathcal{L}$, we can use Algorithm 1 to estimate the various notions of opportunity and effort shown above for both individuals and groups. This process is formally defined in the paper's sections on opportunity and effort.
  • Figure 2: Graphical models for two counterfactual fairness problems with protected attribute $A$, features $X$, and classification outcome $\widehat{Y}$. (a) Interventional approach with intervention $\text{do}(A=A^*)$. (b) Backtracking approach with counterfactual observation $\widehat{Y}^*$. Red nodes are known. Based on Figure 2 in von Kügelgen et al. (2023).
  • Figure 3: (a) Stylized causal model for a hiring example with age $A$, success measure $Y$, latent variables $Z_{A'} \perp A$ and $Z_A \not\perp A$, and qualifications $X_1, X_2$. (b) An alternate version of (a) where latent variable $Z_A$ also impacts $X_1$. (c) A possible DAG for the law school example, reproduced from Kusner et al. (2017). (d) A backtracking counterfactual twin network version of the DAG in (a), now with a Level 3 ICF fair predictor $\widehat{Y}$ as the outcome instead of $Y$.
  • Figure 4: Visualization of how well Individual Equality of Counterfactual Opportunity is satisfied for models using different subsets of covariates across the balanced and unbalanced scenarios, calculated as the absolute energy distance MMD between the two terms of Definition 7.
  • Figure 5: Visualization of (a) Individual Counterfactual Effort Required (Definition 9) and (b) Group-level Counterfactual Effort Required (see the corresponding group-level definition) for models using different subsets of covariates across the balanced and unbalanced scenarios, with energy distance MMD for cost function $\mathcal{L}$.
  • ...and 1 more figure
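
The setup in the Figure 1 caption can be made concrete with a small simulation. The sketch below is not the paper's Algorithm 1. It is a plain rejection sampler, and it assumes identity structural equations ($A = U_A$, $X = U_X$), a target counterfactual observation $\widehat{Y}^* = 1$, and absolute distance $|X^* - X|$ as a stand-in cost. The function names `predictor` and `sample_backtracking` and the example individual ($A = 0$, $X = -0.5$) are hypothetical, chosen only for illustration.

```python
import random


def predictor(a: int, x: float) -> int:
    """Y_hat = A or (X > 0), as stated in the Figure 1 caption."""
    return int(a == 1 or x > 0)


def sample_backtracking(a: int, x: float, target: int = 1,
                        n: int = 20000, seed: int = 0):
    """Rejection-sample backtracking counterfactuals for one individual.

    Assumption (not from the source): structural equations are the
    identity, so A = U_A and X = U_X. Draws follow the caption's
    backtracking conditional: U*_A = U_A and U*_X ~ N(U_X, 1).
    Only draws whose prediction equals `target` are kept.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n):
        a_star = a                    # protected attribute held fixed
        x_star = rng.gauss(x, 1.0)    # alternate initial condition for X
        if predictor(a_star, x_star) == target:
            accepted.append(x_star)
    return accepted, len(accepted) / n


# Hypothetical individual with A = 0 and X = -0.5 (currently Y_hat = 0).
accepted, rate = sample_backtracking(a=0, x=-0.5)

# Stand-in cost: average absolute shift in X among accepted draws.
mean_shift = sum(abs(xs - (-0.5)) for xs in accepted) / len(accepted)

# For comparison, an individual with A = 1 is accepted on every draw.
_, rate_a1 = sample_backtracking(a=1, x=-0.5)
```

Under these assumptions, `rate` approximates how often a favorable outcome is reachable from nearby initial conditions while $A$ stays fixed, and `mean_shift` is a rough proxy for the effort involved. The paper's own definitions of opportunity and effort use the opportunity set $S$ and the cost function $\mathcal{L}$, which this sketch does not reproduce.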

Theorems & Definitions (12)

  • Example 1: Discrimination in Hiring
  • Definition 1: Non-informative Backtracking Conditional Distribution
  • Definition 2: Opportunity Set
  • Definition 3: Individual-level Counterfactual Opportunity
  • Definition 4: Individual-level Realized Opportunity
  • Definition 5: Group-level Counterfactual Opportunity
  • Definition 6: Group-level Realized Opportunity
  • Definition 7: Individual Equality of Counterfactual Opportunity
  • Definition 8: Group-level Equality of Counterfactual Opportunity
  • Definition 9: Individual Counterfactual Effort Required
  • ...and 2 more