Decomposing Discrimination: Causal Mediation Analysis for AI-Driven Credit Decisions

Duraimurugan Rajamanickam

Abstract

Statistical fairness metrics in AI-driven credit decisions conflate two causally distinct mechanisms: discrimination operating directly from a protected attribute to a credit outcome, and structural inequality propagating through legitimate financial features. We formalise this distinction using Pearl's framework of natural direct and indirect effects applied to the credit decision setting. Our primary theoretical contribution is an identification strategy for natural direct and indirect effects under treatment-induced confounding -- the prevalent setting in which protected attributes causally affect both financial mediators and the final decision, violating standard sequential ignorability. We show that interventional direct and indirect effects (IDE/IIE) are identified under the weaker Modified Sequential Ignorability assumption, and prove that IDE/IIE provide conservative bounds on the unidentified natural effects under monotone indirect treatment response. We propose a doubly-robust augmented inverse probability weighted (AIPW) estimator for IDE/IIE with semiparametric efficiency properties, implemented via cross-fitting. An E-value sensitivity analysis addresses residual confounding on the direct pathway. Empirical evaluation on 89,465 real HMDA conventional purchase mortgage applications from New York State (2022) demonstrates that approximately 77% of the observed 7.9 percentage-point racial denial disparity operates through financial mediators shaped by structural inequality, while the remaining 23% constitutes a conservative lower bound on direct discrimination. The open-source CausalFair Python package implements the full pipeline for deployment at resource-constrained financial institutions.
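
For orientation, the four estimands named in the abstract can be written in their standard counterfactual form. The notation here is an assumption and is not taken from the paper: $Y(a, m)$ and $M(a)$ denote potential outcomes, $a$ and $a^*$ are the two levels of the protected attribute, $W$ are baseline covariates, and $G(a \mid W)$ is a random draw from the distribution of $M(a)$ given $W$. Definitions 3.2 and 3.3 may use different symbols or reference levels.

```latex
\begin{align*}
\mathrm{NDE} &= \mathbb{E}\big[\,Y(a, M(a^*)) - Y(a^*, M(a^*))\,\big], \\
\mathrm{NIE} &= \mathbb{E}\big[\,Y(a, M(a)) - Y(a, M(a^*))\,\big], \\
\mathrm{IDE} &= \mathbb{E}\big[\,Y(a, G(a^* \mid W)) - Y(a^*, G(a^* \mid W))\,\big], \\
\mathrm{IIE} &= \mathbb{E}\big[\,Y(a, G(a \mid W)) - Y(a, G(a^* \mid W))\,\big].
\end{align*}
```

In this form $\mathrm{NDE} + \mathrm{NIE} = \mathbb{E}[Y(a, M(a)) - Y(a^*, M(a^*))]$, the total effect. The natural effects involve the cross-world quantity $Y(a, M(a^*))$ for the same applicant, whereas the interventional effects replace the individual's counterfactual mediator with a population-level draw, which is what allows identification under a weaker assumption.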

Paper Structure

This paper contains 39 sections, 5 theorems, 12 equations, 6 figures, 2 tables, 1 algorithm.

Key Result

Proposition 3.1

Under the credit DAG of Definition 3.1, SI.2 is violated whenever there exists at least one unmeasured variable $U$ such that $U \to M_i$ and $U \to Y$ are both edges in the DAG, and $U$ is not a function of $(A, W)$.
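
The practical consequence of an unmeasured $U$ with edges into both a mediator and the outcome can be illustrated with a toy simulation. This is a minimal sketch under assumed conditions, not the paper's model: the linear structural equations, the coefficient values, and the function names `simulate` and `ols` are all hypothetical, and there are no baseline covariates $W$. It compares the coefficient on $A$ from a regression that adjusts for $M$ only against one that also adjusts for $U$.

```python
import numpy as np

def simulate(n=200_000, a=1.0, b=1.0, c=0.5, d=1.0, seed=0):
    """Toy linear SCM (assumed, not from the paper): A -> M -> Y, A -> Y, U -> M, U -> Y."""
    rng = np.random.default_rng(seed)
    A = rng.integers(0, 2, size=n).astype(float)     # binary protected attribute
    U = rng.normal(size=n)                           # unmeasured confounder, independent of A
    M = a * A + U + rng.normal(size=n)               # mediator depends on A and U
    Y = c * A + b * M + d * U + rng.normal(size=n)   # true direct effect of A is c
    return A, U, M, Y

def ols(y, *cols):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones_like(y), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

A, U, M, Y = simulate()
naive_direct = ols(Y, A, M)[1]      # adjusts for M only; U is left out
oracle_direct = ols(Y, A, M, U)[1]  # infeasible in practice: U is unmeasured

print(f"direct effect, U omitted:  {naive_direct:.3f}")
print(f"direct effect, U included: {oracle_direct:.3f}")
```

With these assumed coefficients the regression that omits $U$ recovers a direct effect near 0 although the true value is 0.5, because conditioning on $M$ opens the path $A \to M \leftarrow U \to Y$. The direction and size of the bias depend entirely on the chosen coefficients.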

Figures (6)

  • Figure 1: The credit decision directed acyclic graph (DAG). The unmeasured confounder $U$ simultaneously affects both the financial mediators $M$ and the credit outcome $Y$, creating treatment-induced confounding. This invalidates Sequential Ignorability (SI.2), making natural direct and indirect effects non-identifiable from observational data alone. The red path ($A \to Y$ directly) captures potential direct discrimination; the purple path ($A \to M \to Y$) captures structural inequality propagated through financial features. Dashed border and dashed arrows denote unmeasured variables and paths.
  • Figure 2: Decomposition of the 7.9 pp racial denial gap (real HMDA data, NY 2022) into interventional direct and indirect effects. The IDE (1.9 pp, 23.4%) provides a lower bound on direct discrimination under Proposition 3.3. The IIE (6.1 pp, 76.6%) captures structural inequality propagated through financial mediators; the largest paths are via DTI (2.4 pp), credit score (1.6 pp), income (1.4 pp), and LTV (0.7 pp). Path-specific IIEs allocated proportionally from product-of-coefficients estimates.
  • Figure 3: Causal decomposition of the racial denial disparity estimated from real HMDA data (New York State, 2022). The total effect of 7.94 pp decomposes into an interventional direct effect (IDE) of 1.86 pp (23.4%, corresponding to ECOA disparate treatment) and an interventional indirect effect (IIE) of 6.09 pp (76.6%, corresponding to ECOA disparate impact through financial mediators). Error bars show Wald 95% confidence intervals from 5-fold cross-fitted AIPW estimation.
  • Figure 4: Causal IDE/IIE decomposition (blue) versus SHAP attribution (red) for the real HMDA data. SHAP attributes only 0.8 pp to race directly, substantially below the causal $\hat{\mathrm{IDE}} = 1.9$ pp, because conditioning on mediators $M$ absorbs the indirect effect into mediator SHAP values. The causal decomposition correctly separates the 1.9 pp direct pathway ($\mathrm{IDE}$, a conservative lower bound on $\mathrm{NDE}$) from the 6.1 pp indirect pathway ($\mathrm{IIE}$). SHAP cannot distinguish legitimate risk signals from structural inequality in the mediator attributions.
  • Figure 5: E-value sensitivity curve for the interventional direct effect (IDE) estimate from real HMDA data. The solid curve traces the confounder associations required to explain away $\hat{\mathrm{IDE}} = 1.9$ pp; the dashed curve corresponds to the 95% CI lower bound. The E-value for the central estimate is 1.68: an unmeasured confounder on the $A \to Y$ direct path would need risk-ratio-scale associations $\geq 1.68$ with both race (given $W$) and denial (given $A, M, W$) to nullify the observed IDE. The smaller E-value compared to the indirect pathway reflects the primary finding that most disparity flows through structural channels rather than direct discrimination.
  • ...and 1 more figures
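
The numbers quoted in the captions above can be cross-checked with a few lines of arithmetic. The sketch below assumes the E-value in Figure 5 follows the standard risk-ratio formula $E = \mathrm{RR} + \sqrt{\mathrm{RR}(\mathrm{RR} - 1)}$; the captions do not state how the IDE was converted to the risk-ratio scale, so the implied risk ratio is an inference, not a reported result. The function names are hypothetical.

```python
import math

def e_value(rr):
    """Standard E-value for a risk ratio; ratios below 1 are inverted first."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    rr = max(rr, 1.0 / rr)
    return rr + math.sqrt(rr * (rr - 1.0))

def rr_from_e_value(e):
    """Invert the formula above: RR = E^2 / (2E - 1), valid for E >= 1."""
    if e < 1:
        raise ValueError("an E-value is at least 1")
    return e ** 2 / (2.0 * e - 1.0)

# Figure 3: shares of the two components, in percentage points.
ide_pp, iie_pp = 1.86, 6.09
ide_share = 100 * ide_pp / (ide_pp + iie_pp)
iie_share = 100 * iie_pp / (ide_pp + iie_pp)

# Figure 2: path-specific IIEs (DTI, credit score, income, LTV).
path_total = sum([2.4, 1.6, 1.4, 0.7])

# Figure 5: risk ratio implied by the reported E-value of 1.68 (assumed formula).
implied_rr = rr_from_e_value(1.68)

print(f"IDE share {ide_share:.1f}%, IIE share {iie_share:.1f}%")
print(f"path-specific IIEs sum to {path_total:.1f} pp")
print(f"implied risk ratio for the IDE: {implied_rr:.3f}")
```

The shares come out at 23.4% and 76.6% when computed against the sum of the two rounded components (7.95 pp, versus the reported total of 7.94 pp, a rounding difference), and the four listed paths add up to the 6.1 pp IIE. Under the assumed formula, an E-value of 1.68 corresponds to a risk ratio of roughly 1.20.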

Theorems & Definitions (12)

  • Definition 3.1: Credit Decision DAG
  • Definition 3.2: Natural Direct and Indirect Effects
  • Definition 3.3: Interventional Direct and Indirect Effects
  • Proposition 3.1: Violation of SI in Credit Data
  • proof
  • Proposition 3.2: Identification of IDE/IIE under MSI
  • proof
  • Proposition 3.3: Conservative Bounds on NDE/NIE
  • proof
  • Theorem 4.1: Semiparametric Efficiency and Double Robustness
  • ...and 2 more