Exactly Computing do-Shapley Values

R. Teal Witter; Álvaro Parafita; Tomas Garriga; Maximilian Muschalik; Fabian Fumagalli; Axel Brando; Lucas Rosenblatt

TL;DR

This work tackles the computational bottleneck of computing do-Shapley values in Structural Causal Models (SCMs) by grouping coalitions into equivalence classes, each spanning from a basis (its smallest coalition) to a closure (its largest). It delivers an exact algorithm that runs in time linear in the number of irreducible sets $r$, which ranges from $d$ to $2^d$ and is often far below $2^d$ on real graphs. For query budgets $m$ smaller than $r$, it introduces a structure-aware boundary sampling scheme that is more accurate than structure-agnostic estimators and returns the exact values once $m \geq r$. A linear-time identifiability check reduces practical risk: identifying the interventional effects of the $d$ singleton coalitions suffices to identify all class queries. Empirical results on the TALENT benchmark show substantial speedups and accuracy gains over structure-agnostic baselines. The framework also generalizes to other value notions and interaction indices.

Abstract

Structural Causal Models (SCMs) are a powerful framework for describing complicated dynamics across the natural sciences. A particularly elegant way of interpreting SCMs is do-Shapley, a game-theoretic method of quantifying the average effect of $d$ variables across exponentially many interventions. Like Shapley values, computing do-Shapley values generally requires evaluating exponentially many terms. The foundation of our work is a reformulation of do-Shapley values in terms of the irreducible sets of the underlying SCM. Leveraging this insight, we can exactly compute do-Shapley values in time linear in the number of irreducible sets $r$, which itself can range from $d$ to $2^d$ depending on the graph structure of the SCM. Since $r$ is unknown a priori, we complement the exact algorithm with an estimator that, like general Shapley value estimators, can be run with any query budget. As the query budget approaches $r$, our estimators can produce more accurate estimates than prior methods by several orders of magnitude, and, when the budget reaches $r$, return the do-Shapley values up to machine precision. Beyond computational speed, we also reduce the identification burden: we prove that non-parametric identifiability of do-Shapley values requires only the identification of interventional effects for the $d$ singleton coalitions, rather than all classes.
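
For orientation, the block below writes out the standard Shapley value formula with the interventional value function $\nu$ quoted in the Figure 2 caption, and then regroups the sum by equivalence class. It is a sketch of why $r$ evaluations of $\nu$ can suffice, assuming $\nu$ takes one value on every coalition in a class. The symbols $C$, $\nu(C)$, and $w_i(C)$ are notation introduced here for illustration and are not taken from the paper, and the regrouping should not be read as the paper's Lemma 3.1.

```latex
% Standard Shapley value of variable i under the interventional value function.
\phi_i
  = \sum_{S \subseteq [d] \setminus \{i\}}
    \frac{|S|!\,(d-|S|-1)!}{d!}
    \bigl[\nu(S \cup \{i\}) - \nu(S)\bigr],
\qquad
\nu(S) = \mathbb{E}\bigl[Y \mid \mathrm{do}(S = \mathbf{x}_S)\bigr].

% Assumption: \nu is constant on each class C, with common value \nu(C).
% Collecting the terms that share a class gives one evaluation per class.
\phi_i = \sum_{C} \nu(C)\, w_i(C),
\qquad
w_i(C)
  = \sum_{\substack{S \in C \\ i \in S}} \frac{(|S|-1)!\,(d-|S|)!}{d!}
  - \sum_{\substack{S \in C \\ i \notin S}} \frac{|S|!\,(d-|S|-1)!}{d!}.
```

The weights $w_i(C)$ depend only on the graph, so in this form the value function enters through one number per class.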

Paper Structure (25 sections, 17 theorems, 19 equations, 13 figures, 5 tables, 9 algorithms)

Introduction
Additional Related Work
Reformulating do-Shapley Values
Exactly Computing do-Shapley Values
Simple Class Optimization
Approximating do-Shapley Values
Boundary Sampling
Estimation via Simulation
Identifiability
Experiments
Lattice Complexity Reduction
Generalizations
Delayed Proofs
Background on Structural Causal Models
Simulated Sampling from Irreducible Sets
...and 10 more sections

Key Result

Lemma 3.1

Let $\bar{S} \subset [d]$ be a closed set with basis $\underline{S}$. Then

Figures (13)

Figure 1: An example Structural Causal Model (SCM) and the corresponding lattice of coalitions. Because of the graph structure, intervening on some nodes is redundant. For example, setting $\{X_1, X_2, X_3, X_4\}$ has the same effect as setting $\{X_3\}$ because $X_3$ blocks all directed paths from the other nodes to $Y$. For such a class, we refer to its smallest coalition (e.g., $\{X_3\}$) as the basis, and the largest coalition (e.g., $\{X_1, X_2,X_3,X_4\}$) as the closure.
Figure 2: Left: A learned SCM from a TALENT dataset. Nodes and edges represent the learned causal graph used to define the interventional value function $\nu(S)=\mathbb{E}[Y\mid \mathrm{do}(S=\mathbf{x}_S)]$ for a fixed instance $\mathbf{x}$. Right: Plots of estimated vs true do-Shapley values on the learned SCM for randomly sampled $\mathbf{x}$. Compared to the value-function-agnostic state-of-the-art RegressionMSR and LeverageSHAP estimators, our doEstimator variants provide substantially more accurate do-Shapley value estimates.
Figure 3: The number of irreducible sets ranges between $d$ and $2^d$.
Figure 4: Complexity Reduction. The number of irreducible sets $r$ (colored points) versus the dimension $d$. While the theoretical worst-case complexity is $2^d$ (red dashed line), real-world causal structures are often sparse, resulting in $r$ scaling in between the exponential and the linear lower bound $d$ (black dotted line).
Figure 5: Estimator Convergence (Aggregated). The relative MSE of Shapley value estimates versus the budget ratio $m/r$, aggregated across all datasets. Shaded regions indicate 95% confidence intervals. Our structure-aware estimators consistently outperform the baseline variants. Notably, as the budget exceeds the number of classes ($m > r$, red dotted line), our error vanishes to machine precision, whereas baselines continue to exhibit variance.
...and 8 more figures
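
The basis and closure described in the Figure 1 caption can be sketched in a few lines of Python. The graph below is a hypothetical toy example (not the paper's Figure 1), and the reachability rule is one plausible reading of the caption: a node stays in the basis only if it has a directed path to $Y$ that avoids the other intervened nodes. The names `reaches_target`, `basis`, `closure`, `irreducible_bases`, `shapley_with_class_cache`, and `toy_value` are all illustrative assumptions. The Shapley routine still enumerates every coalition by brute force; it only shows that the value function is queried once per class, and it is not the paper's linear-in-$r$ algorithm.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy graph (assumption): X1 -> X2 -> X3 -> Y and X4 -> X3.
CHILDREN = {"X1": ["X2"], "X2": ["X3"], "X4": ["X3"], "X3": ["Y"]}
FEATURES = ["X1", "X2", "X3", "X4"]
TARGET = "Y"


def reaches_target(node, blocked):
    """True if some directed path node -> ... -> TARGET avoids `blocked`."""
    stack, seen = [node], {node}
    while stack:
        u = stack.pop()
        for v in CHILDREN.get(u, ()):
            if v == TARGET:
                return True
            if v not in blocked and v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def basis(coalition):
    """Smallest coalition with the same effect: drop nodes screened off by the rest."""
    s = frozenset(coalition)
    return frozenset(i for i in s if reaches_target(i, s - {i}))


def closure(coalition):
    """Largest coalition with the same effect: add every node the basis screens off."""
    b = basis(coalition)
    return b | frozenset(j for j in FEATURES if not reaches_target(j, b))


def all_coalitions():
    for k in range(len(FEATURES) + 1):
        for c in combinations(FEATURES, k):
            yield frozenset(c)


def irreducible_bases():
    """Distinct bases over all coalitions, including the empty one."""
    return {basis(s) for s in all_coalitions()}


def shapley_with_class_cache(value_fn):
    """Brute-force Shapley values; value_fn is called once per distinct basis."""
    cache = {}

    def nu(s):
        b = basis(s)
        if b not in cache:
            cache[b] = value_fn(b)
        return cache[b]

    d = len(FEATURES)
    phi = {}
    for i in FEATURES:
        others = [f for f in FEATURES if f != i]
        total = 0.0
        for k in range(d):
            weight = factorial(k) * factorial(d - k - 1) / factorial(d)
            for c in combinations(others, k):
                s = frozenset(c)
                total += weight * (nu(s | {i}) - nu(s))
        phi[i] = total
    return phi, len(cache)


# Hypothetical stand-in for E[Y | do(S = x_S)], evaluated only on bases.
NODE_EFFECT = {"X1": 1.0, "X2": 2.0, "X3": 4.0, "X4": 0.5}


def toy_value(b):
    return sum(NODE_EFFECT[n] for n in b)


phi, num_queries = shapley_with_class_cache(toy_value)
```

On this toy graph, `basis(FEATURES)` collapses to the single node `X3` and `closure({"X3"})` recovers all four nodes, mirroring the redundancy the caption describes. The value function is queried once per distinct basis rather than once per coalition, which is the saving the class structure provides.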

Theorems & Definitions (36)

Definition 2.1: Basis
Definition 2.2: Closure
Lemma 3.1
Proposition 3.2
Proposition 4.1
Theorem 5.1
proof: Proof of Lemma \ref{lemma:closed_plus}
proof: Proof of Proposition \ref{prop:boundary_sampler_time}
Proposition 3.1
proof
...and 26 more
