Local Causal Discovery for Structural Evidence of Direct Discrimination

Jacqueline Maasch, Kyra Gan, Violet Chen, Agni Orfanoudaki, Nil-Jana Akpinar, Fei Wang

TL;DR

The paper addresses the problem of identifying direct discrimination in complex domains where the full causal graph is unknown. It introduces LD3, a local causal discovery method that targets the causal parents of the outcome and yields a valid adjustment set for the weighted controlled direct effect (WCDE). LD3 is embedded within a causal fairness analysis (CFA) framework using a graphical interpretation of causal partitions, and is proven asymptotically correct under mild assumptions while requiring only $O(|\boldsymbol{Z}|)$ conditional independence (CI) tests. The authors introduce a graphical criterion for the WCDE, enabling direct discrimination to be assessed through the derived adjustment set $\boldsymbol{A}_{\text{DE}}$, and report that LD3 gives more stable and interpretable results than global baselines across synthetic benchmarks and two real-world cases (COMPAS recidivism and liver transplant allocation). The practical contribution is a scalable, interpretable, and statistically grounded tool for policy analysis and algorithmic fairness that supports targeted interventions on the detected direct mechanisms of unfairness.

Abstract

Identifying the causal pathways of unfairness is a critical objective for improving policy design and algorithmic decision-making. Prior work in causal fairness analysis often requires knowledge of the causal graph, hindering practical applications in complex or low-knowledge domains. Moreover, global discovery methods that learn causal structure from data can display unstable performance on finite samples, preventing robust fairness conclusions. To mitigate these challenges, we introduce local discovery for direct discrimination (LD3): a method that uncovers structural evidence of direct unfairness by identifying the causal parents of an outcome variable. LD3 performs a linear number of conditional independence tests relative to variable set size, and allows for latent confounding under the sufficient condition that all parents of the outcome are observed. We show that LD3 returns a valid adjustment set (VAS) under a new graphical criterion for the weighted controlled direct effect, a qualitative indicator of direct discrimination. LD3 limits unnecessary adjustment, providing interpretable VAS for assessing unfairness. We use LD3 to analyze causal fairness in two complex decision systems: criminal recidivism prediction and liver transplant allocation. LD3 was more time-efficient and returned more plausible results on real-world data than baselines, which took 46$\times$ to 5870$\times$ longer to execute.

Paper Structure (61 sections, 3 theorems, 11 equations, 17 figures, 16 tables, 1 algorithm)

Key Result

Theorem 1

Asymptotic guarantees on partitioning and SDC correctness hold under the assumptions that $Y$ has no descendants in $\mathbf{Z}$ (assumption:y_no_desc) and that all parents of $Y$ are observed (assumption:all_parents_y). Given WCDE identifiability by the WCDE equation (eq:wcde; see the remark on confounding assumptions), these two assumptions are also sufficient for VAS di

Figures (17)

  • Figure 1: The standard fairness model (SFM) is compactly represented as a local subgraph around protected attribute $X$ and outcome $Y$ (plecko_causal_2024). Variables that are irrelevant to CFA are abstracted away, leaving confounders ($\mathbf{C}$) and mediators ($\mathbf{M}$). Directed edges represent the existence of active directed paths, and bidirected edges denote potential confounding. This work aims to identify direct mechanisms of unfairness in a data-driven way.
  • Figure 2: LD3 assesses whether the edge $X \to Y$ exists. (A) Allowable partitions under the assumptions on the descendants and parents of $Y$ (assumption:y_no_desc, assumption:all_parents_y). (B) Parents of $Y$ returned by LD3. Nodes are partition sets or subsets. Partition interrelations and latent confounding are abstracted away. Bidirected edge $Y \leftrightarrow \mathbf{Z}_2$ signifies $\mathbf{Z}_2 \notin de(Y)$. Edges with $\cdots$ are paths of arbitrary length. Solid edges are adjacencies.
  • Figure 3: Baseline results for parent discovery in Sangiovese. Independence test count (Tests) is reported for constraint-based methods. Time is in seconds. Shaded regions denote 95% confidence intervals over ten replicates.
  • Figure 4: Predicted parent sets, SDC, and WCDE estimates for COMPAS. Exposure is race (R; red) and outcome is general recidivism risk decile score (DS; blue). Known parents of DS are in yellow. A = age; CD = charge degree; JF = juvenile felonies; JM = juvenile misdemeanors; PC = priors count; S = sex. WCDE is reported with $p$-value ($p$) and 95% confidence intervals in brackets. All methods used $\chi^2$ CI tests ($\alpha = 0.05$). Full results are in the figures labeled fig:compas_ld3 through fig:compas_ldecc.
  • Figure 5: Predicted parent sets, SDC, and WCDE estimates for STAR liver data. Exposure is patient sex (S; red) and outcome is receiving a liver (L; blue). Known parents of L are in yellow. AE = active exception case; BT = recipient blood type; DX = diagnosis; ED = education; ET = ethnicity; EX = exception type; IA = initial age; IM = initial MELD; PM = payment method; RE = region; WE = weight. WCDE is reported with $p$-value ($p$) and 95% confidence intervals in brackets. All methods used $\chi^2$ CI tests ($\alpha = 0.01$). Additional results are in the tables labeled tab:liver through tab:liver_ldecc.
  • ...and 12 more figures
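
The captions for Figures 4 and 5 state that all methods used $\chi^2$ CI tests at a fixed significance level $\alpha$. As an illustration of what one such test involves, the following is a minimal sketch of a stratified $\chi^2$ conditional independence test on discrete data. It is not the authors' implementation and does not reproduce LD3's test schedule. The names `chi2_sf`, `chi2_ci_test`, and `rows`, and the toy variables `X`, `Y`, `W`, are hypothetical and chosen only for this example.

```python
import math
from collections import Counter, defaultdict


def chi2_sf(stat, dof):
    """Upper-tail probability of a chi-square variable with integer dof."""
    if dof <= 0:
        return 1.0
    y = stat / 2.0
    # Start from Q(1, y) for even dof or Q(1/2, y) for odd dof, then apply
    # the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < dof / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return min(q, 1.0)


def chi2_ci_test(rows, x, y, cond=(), alpha=0.05):
    """Test whether x is independent of y given cond (rows: list of dicts).

    Pearson chi-square statistics and degrees of freedom are summed over
    the strata defined by the conditioning variables.
    Returns (independent, p_value).
    """
    strata = defaultdict(list)
    for row in rows:
        strata[tuple(row[c] for c in cond)].append((row[x], row[y]))

    stat, dof = 0.0, 0
    for pairs in strata.values():
        n = len(pairs)
        joint = Counter(pairs)
        x_counts = Counter(a for a, _ in pairs)
        y_counts = Counter(b for _, b in pairs)
        if len(x_counts) < 2 or len(y_counts) < 2:
            continue  # a constant variable carries no evidence in this stratum
        for a, n_a in x_counts.items():
            for b, n_b in y_counts.items():
                expected = n_a * n_b / n
                stat += (joint[(a, b)] - expected) ** 2 / expected
        dof += (len(x_counts) - 1) * (len(y_counts) - 1)

    p_value = chi2_sf(stat, dof)
    return p_value > alpha, p_value


# Toy data (assumed): X and Y are associated only through W.
cells = {
    (0, 0, 0): 64, (0, 0, 1): 16, (0, 1, 0): 16, (0, 1, 1): 4,
    (1, 0, 0): 4, (1, 0, 1): 16, (1, 1, 0): 16, (1, 1, 1): 64,
}
rows = [
    {"W": w, "X": xv, "Y": yv}
    for (w, xv, yv), count in cells.items()
    for _ in range(count)
]
print(chi2_ci_test(rows, "X", "Y"))               # marginal test
print(chi2_ci_test(rows, "X", "Y", cond=("W",)))  # test given W
```

In this toy dataset, X and Y are marginally dependent but independent within each stratum of W by construction, so the first call rejects independence and the second does not. Stratified $\chi^2$ tests lose power as the conditioning set grows, because each stratum holds fewer samples.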

Theorems & Definitions (19)

  • Definition 1: Structural causal model, bareinboim2022pearl
  • Definition 2: Standard fairness model, plecko_causal_2024
  • Definition 3: Structural direct criterion (SDC), plecko_causal_2024
  • Definition 4: CDE, pearl_interpretation_2014
  • Definition 5: Identifiability conditions of the CDE, vanderweele_controlled_2011
  • Remark 1: Assumptions on Confounding
  • Definition 6: WCDE, pearl2000models
  • Definition 7: VAS for the WCDE
  • Theorem 1
  • Theorem 2
  • ...and 9 more