Practical do-Shapley Explanations with Estimand-Agnostic Causal Inference

Álvaro Parafita, Tomas Garriga, Axel Brando, Francisco J. Cazorla

TL;DR

This work tackles the challenge of explaining ML and data-driven decisions in the presence of causal structure by advancing do-SHAP explanations. It replaces the prior estimand-based estimation with an estimand-agnostic (EA) approach that learns a single structural causal model (SCM) of the observational distribution and uses general SCM procedures to estimate any identifiable interventional query, enabling scalable do-Shapley computations. To further accelerate explanations, the Frontier-Reducibility Algorithm (FRA) reduces the number of coalitions that must be evaluated by mapping them to irreducible subsets and caching results, delivering large speedups with negligible overhead. The authors also extend do-SHAP explanations to inaccessible Data Generating Processes (DGPs) via noise-attribution results and demonstrate the method on synthetic and real datasets, highlighting improved reliability over non-causal SHAP variants. Overall, the paper provides a practical, model-agnostic pipeline for causal attributions with scalable computation and broad applicability to debugging, auditing, and trustworthy AI.

Abstract

Among explainability techniques, SHAP stands out as one of the most popular, but often overlooks the causal structure of the problem. In response, do-SHAP employs interventional queries, but its reliance on estimands hinders its practical application. To address this problem, we propose the use of estimand-agnostic approaches, which allow for the estimation of any identifiable query from a single model, making do-SHAP feasible on complex graphs. We also develop a novel algorithm to significantly accelerate its computation at a negligible cost, as well as a method to explain inaccessible Data Generating Processes. We demonstrate the estimation and computational performance of our approach, and validate it on two real-world datasets, highlighting its potential in obtaining reliable explanations.
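To make the idea of interventional queries concrete, the following is a minimal sketch of exact do-Shapley values on a toy version of the salary example. Everything specific here is an assumption for illustration: the edges and coefficients of the linear SCM, the explained values in `x`, and the helper names `value` and `do_shapley`. The value of a coalition is taken to be $\mathbb{E}[Y \mid do(\mathbf{X}_\mathbf{S} = \mathbf{x}_\mathbf{S})]$, estimated by sampling from the hand-written SCM rather than from a learned estimand-agnostic model.

```python
import itertools
import math

import numpy as np

FEATURES = ("A", "E", "S")  # Age, Education, Seniority (hypothetical graph)


def value(coalition, x, n=20_000, seed=0):
    """Monte Carlo estimate of E[Y | do(X_S = x_S)] in a made-up linear SCM."""
    rng = np.random.default_rng(seed)  # same seed everywhere: common random numbers
    u = {k: rng.normal(size=n) for k in ("A", "E", "S", "Y")}

    def node(name, natural):
        # An intervened node ignores its parents and takes the value in x.
        return np.full(n, float(x[name])) if name in coalition else natural

    # Assumed structural equations, listed in topological order.
    a = node("A", 40.0 + 10.0 * u["A"])
    e = node("E", 10.0 + 0.1 * a + u["E"])
    s = node("S", 0.5 * a - 1.0 * e + u["S"])
    y = 20.0 + 0.2 * a + 1.5 * e + 1.0 * s + u["Y"]
    return float(y.mean())


def do_shapley(x):
    """Exact Shapley values over all coalitions, caching each nu(S)."""
    k = len(FEATURES)
    cache = {}

    def nu(coalition):
        key = frozenset(coalition)
        if key not in cache:
            cache[key] = value(key, x)
        return cache[key]

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        phi[f] = 0.0
        for r in range(k):
            weight = math.factorial(r) * math.factorial(k - r - 1) / math.factorial(k)
            for subset in itertools.combinations(others, r):
                phi[f] += weight * (nu(subset + (f,)) - nu(subset))
    # Second return value: nu(all features) - nu(empty coalition).
    return phi, nu(FEATURES) - nu(())


phi, gap = do_shapley({"A": 50.0, "E": 16.0, "S": 12.0})
print(phi, gap)
```

Because every $\nu(\mathbf{S})$ is cached and reused, the attributions sum to the gap between intervening on all features and intervening on none, which is the usual Shapley efficiency property. With $K$ features this exact enumeration needs $2^K$ interventional queries, which is why reducing the number of evaluated coalitions matters.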

Paper Structure

This paper contains 45 sections, 16 theorems, 25 equations, 16 figures, 1 table, and 3 algorithms.

Key Result

Proposition 4.2

For any non-ancestor $X$ of $Y$, $\phi_X = 0$.
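
A sketch of why this holds, under the usual do-SHAP definitions (the notation below is assumed here, not quoted from the paper): let $N$ be the set of $K$ features and $\nu(\mathbf{S}) = \mathbb{E}[Y \mid do(\mathbf{X}_\mathbf{S} = \mathbf{x}_\mathbf{S})]$ the value of coalition $\mathbf{S}$. The Shapley value of $X$ is

$$\phi_X = \sum_{\mathbf{S} \subseteq N \setminus \{X\}} \frac{|\mathbf{S}|!\,(K - |\mathbf{S}| - 1)!}{K!} \left[ \nu(\mathbf{S} \cup \{X\}) - \nu(\mathbf{S}) \right].$$

If $X$ is not an ancestor of $Y$, intervening on $X$ cannot change $Y$ regardless of which other variables are intervened on, so $\nu(\mathbf{S} \cup \{X\}) = \nu(\mathbf{S})$ for every $\mathbf{S}$. Each bracket vanishes and therefore $\phi_X = 0$.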

Figures (16)

  • Figure 1: Salary causal graph: Age (A), Education (E), Seniority (S) and Salary (Y).
  • Figure 2: FRA execution example. a) Causal Graph with nodes in alphabetical order representing the selected topological order. b) FRA execution steps, with $k$ representing the loop step (lines 5--6, 24), $X$ the current node (line 7), $\textbf{P}$ the potential frontier for $X$, and $\textbf{Z}$ storing the nodes to be removed from $\textbf{S}$. The result of this execution is the coalition reduction $\{A, C, E, F\} \rightarrow \{C, F\}$.
  • Figure 3: Semi-Markovian graph. The Markovian graph results from considering $U_{\{{X}, {B}\}}$ as measured.
  • Figure 4: Markovian case. Box-plots computed over $30$ realizations of the dataset. (a) Distribution adjustment score, log-likelihood (bigger is better). (b) SHAP estimation loss, $\mathcal{L}$ (lower is better). (c) Feature Importance (the closer to ground-truth, the better). Dashed horizontal line represents uniform importance $(\frac{1}{K})$. See the appendix on synthetic experiments for a bigger figure.
  • Figure 5: FRA experiments. (a) Ratio of computed coalitions after FRA. (b) FRA execution time per coalition. (c) do-SHAP execution time (logarithmic scale) without cache (baseline), with cache (cache) and with an FRA cache (FRA). Error bars at 2-sigma over $30$ replications.
  • ...and 11 more figures
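
The coalition reduction mentioned in the Figure 2 caption can be illustrated with a much simpler reachability rule. This is a sketch under stated assumptions, not the paper's FRA: the graph in `CHILDREN` is hypothetical (chosen only so that $\{A, C, E, F\}$ reduces to $\{C, F\}$), the graph is assumed Markovian (no latent confounders), and the helper names are made up. The rule drops an intervened node when every directed path from it to $Y$ passes through another intervened node, since its value can then no longer influence $Y$.

```python
from itertools import combinations

# Hypothetical causal graph: node -> list of children. "Y" is the target.
CHILDREN = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["D"],
    "D": ["Y"],
    "E": ["F"],
    "F": ["Y"],
    "Y": [],
}
TARGET = "Y"


def reaches_target(start, blocked):
    """True if a directed path start -> ... -> Y avoids every node in `blocked`."""
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        for child in CHILDREN[node]:
            if child == TARGET:
                return True
            if child in blocked or child in seen:
                continue
            seen.add(child)
            stack.append(child)
    return False


def reduce_coalition(coalition):
    """Keep only nodes whose intervention can still reach Y (simplified rule)."""
    coalition = frozenset(coalition)
    return frozenset(x for x in coalition if reaches_target(x, coalition - {x}))


def count_distinct_reductions():
    """Number of distinct reduced coalitions versus all 2^K coalitions."""
    features = [n for n in CHILDREN if n != TARGET]
    reduced = set()
    for r in range(len(features) + 1):
        for subset in combinations(features, r):
            reduced.add(reduce_coalition(subset))
    return len(reduced), 2 ** len(features)


print(sorted(reduce_coalition({"A", "C", "E", "F"})))
print(count_distinct_reductions())
```

Using the reduced coalition as the cache key means that coalitions sharing the same reduction need only one interventional query, which is the kind of saving that panel (a) of Figure 5 reports for FRA.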

Theorems & Definitions (37)

  • Proposition 4.2
  • Definition 4.4
  • Proposition 4.5
  • Remark 4.6
  • Theorem 4.7
  • Proposition 4.8
  • Theorem 4.9
  • Definition C.1
  • Definition C.2
  • Theorem C.3
  • ...and 27 more