Stabilizing Estimates of Shapley Values with Control Variates

Jeremy Goldwasser; Giles Hooker

TL;DR

This work tackles the instability of Shapley-value explanations caused by Monte Carlo sampling by introducing ControlSHAP, a general, model-agnostic variance-reduction technique based on control variates. It uses a tractable Taylor approximation of the model, whose Shapley estimates are correlated with those of the model itself, as a control variate; the coefficient is chosen to minimize variance, yielding substantial reductions in Shapley-value variability (up to $90\%$ in some cases) with minimal extra computation. The approach applies to both independent and correlated feature settings and to differentiable or non-differentiable models (via finite-difference gradients), and it can be combined with Shapley Sampling or KernelSHAP. Empirical results on five high-dimensional datasets show improved stability in Shapley estimates and rankings. The anticipated variance reduction can also be estimated from observed correlations, enabling faster convergence and more trustworthy explanations in practice.

Abstract

Shapley values are among the most popular tools for explaining predictions of blackbox machine learning models. However, their high computational cost motivates the use of sampling approximations, inducing a considerable degree of uncertainty. To stabilize these model explanations, we propose ControlSHAP, an approach based on the Monte Carlo technique of control variates. Our methodology is applicable to any machine learning model and requires virtually no extra computation or modeling effort. On several high-dimensional datasets, we find it can produce dramatic reductions in the Monte Carlo variability of Shapley estimates.

Paper Structure (22 sections, 3 theorems, 26 equations, 10 figures, 1 table)

Introduction
Shapley Values in Machine Learning
Shapley Values
Shapley Estimation
Related Work
Control Variates for Shapley Values
Control Variates
ControlSHAP
Independent Features
Correlated Features
Non-Differentiable Models
Variance Estimation
Shapley Sampling
KernelSHAP
Experiments
...and 7 more sections

Key Result

Lemma

Suppose $\hat{A}, \hat{B}$ are unbiased estimators and $\rho_{\hat{A}, \hat{B}} > 0$, where $\rho$ is Pearson's correlation coefficient. Then the control variates estimator (basic CV) has minimal variance $\text{Var}(\tilde{A}) = (1 - \rho^2_{\hat{A}, \hat{B}}) \text{Var}(\hat{A})$ when $c^* = \frac
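
The variance reduction promised by the lemma can be checked numerically on a toy problem. The sketch below is not the paper's ControlSHAP code: it assumes the common textbook form of the estimator, $\tilde{A} = \hat{A} - c(\hat{B} - \mathbb{E}[B])$ with $c = \text{Cov}(\hat{A}, \hat{B}) / \text{Var}(\hat{B})$ estimated from the sample, since the statement above is cut off before the coefficient. The target $\mathbb{E}[e^X]$ for standard normal $X$, the first-order Taylor surrogate $1 + x$, and the names `control_variate_mean` and `run_demo` are all illustrative assumptions.

```python
import numpy as np


def control_variate_mean(a, b, b_mean):
    """Control-variate estimate of E[A] from paired samples (a, b).

    Assumes the textbook form  mean(a) - c * (mean(b) - b_mean),
    with c = Cov(a, b) / Var(b) estimated from the same samples.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cov = np.cov(a, b)              # 2x2 sample covariance matrix
    c = cov[0, 1] / cov[1, 1]       # estimated variance-minimizing coefficient
    return a.mean() - c * (b.mean() - b_mean), c


def run_demo(n_reps=2000, n=200, seed=0):
    """Compare plain Monte Carlo with the control-variate estimator."""
    rng = np.random.default_rng(seed)
    plain, adjusted = [], []
    for _ in range(n_reps):
        x = rng.standard_normal(n)
        f = np.exp(x)               # "model" output (hypothetical toy target)
        g = 1.0 + x                 # first-order Taylor surrogate; E[g] = 1 exactly
        plain.append(f.mean())
        est, _ = control_variate_mean(f, g, b_mean=1.0)
        adjusted.append(est)
    # Estimate the correlation between target and surrogate from a large sample.
    big_x = rng.standard_normal(200_000)
    rho = np.corrcoef(np.exp(big_x), 1.0 + big_x)[0, 1]
    return np.var(plain, ddof=1), np.var(adjusted, ddof=1), rho


var_plain, var_cv, rho = run_demo()
print(f"variance ratio (CV / plain): {var_cv / var_plain:.3f}")
print(f"1 - rho^2:                   {1 - rho**2:.3f}")
```

The two printed numbers should roughly agree, which is the content of the lemma: the stronger the correlation between the estimator and its control variate, the larger the variance reduction. Estimating $c$ from the same samples introduces a small bias that this sketch ignores.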

Figures (10)

Figure 1: Variance (or MSE) of Shapley estimates. Computed independent-features Shapley sampling estimates on "Prev Days" feature, using logistic regression model on Bank dataset. Bands show inner quartiles across 50 iterations.
Figure 2: Variance reductions of ControlSHAP methods. Sorted features by absolute Shapley values (unscaled in plot) with logistic regression predictor. Confidence bands denote lower and upper quartiles of the variance-reduction equation across 40 held-out values of $x$.
Figure 3: Average number of ranking changes (as defined in the rank-changes equation) across 50 iterations, with and without ControlSHAP correction. Shapley values estimated on 40 inputs via KernelSHAP with correlated features. Neural network trained on Credit dataset.
Figure 4: Variance Reductions of Random Forest on simulated dataset with 10 features. Upper and lower quantiles of reductions provided across 40 inputs. Computed gradients via finite differencing.
Figure 5: Normalized absolute difference between the sum of Shapley estimates and $f(x) - \mathbb{E}[f(X)]$. Ran for 10 iterations on Credit dataset with independent features.
...and 5 more figures

Theorems & Definitions (4)

Lemma
Theorem
Theorem
Proof
