InstaSHAP: Interpretable Additive Models Explain Shapley Values Instantly

James Enouen, Yan Liu

TL;DR

InstaSHAP provides a unifying, GAM-based view of SHAP explanations, showing that SHAP’s limitations in speed and interaction representation stem from its alignment with limited functional ANOVA spaces. By casting SHAP and its extensions within a variational GAM framework and introducing an automatic masking/purification objective, the authors enable instant, purified SHAP values via a forward pass. The work establishes theoretical correspondences between SHAP, GAM, and functional ANOVA under correlated inputs, and demonstrates practical benefits through synthetic and real-world tabular and high-dimensional experiments. The approach offers a principled means to assess SHAP trustworthiness and highlights the essential role of modeling feature interactions, especially in domains with strong input correlations such as CV and NLP.

Abstract

In recent years, the Shapley value and SHAP explanations have emerged as one of the most dominant paradigms for providing post-hoc explanations of black-box models. Despite their well-founded theoretical properties, many recent works have focused on the limitations in both their computational efficiency and their representation power. The underlying connection with additive models, however, is left critically under-emphasized in the current literature. In this work, we find that a variational perspective linking GAM models and SHAP explanations is able to provide deep insights into nearly all recent developments. In light of this connection, we borrow in the other direction to develop a new method to train interpretable GAM models which are automatically purified to compute the Shapley value in a single forward pass. Finally, we provide theoretical results showing the limited representation power of GAM models is the same Achilles' heel existing in SHAP and discuss the implications for SHAP's modern usage in CV and NLP.

Paper Structure

This paper contains 62 sections, 6 theorems, 85 equations, 22 figures, 10 tables.

Key Result

Theorem 1

(SHAP $\cong$ ANOVA-1) SHAP succeeds on every hypothesis in a hypothesis space ${\mathcal{H}}$ if and only if ${\mathcal{H}}$ is completely free of feature interactions (${\mathcal{H}} \subseteq {\mathcal{H}}_{\text{ANOVA}}^{\leq 1}$).
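
The intuition behind this statement can be illustrated with a small numerical sketch. It is not the paper's code or its InstaSHAP training procedure: it computes exact Shapley values by brute force for two toy models, using marginal masking over a tiny background set with independent, zero-mean features. The helper names (`shapley_values`, `make_value_fn`), the toy models, and the background grid are all assumptions made for illustration, and reading "succeed" as "the per-feature attributions recover the function's structure" is likewise an interpretation rather than the paper's formal definition.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(value_fn, n_features):
    """Exact Shapley values of a set function value_fn(frozenset) -> float."""
    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def make_value_fn(model, x, background):
    """Marginal masking (an assumption): features outside the subset are
    replaced by background rows and the model output is averaged."""
    def value_fn(subset):
        data = background.copy()
        idx = sorted(subset)
        if idx:
            data[:, idx] = x[idx]
        return float(model(data).mean())
    return value_fn


# Hypothetical background: full product grid, so the two features are
# independent and each has mean zero.
background = np.array([[-1.0, -2.0], [-1.0, 2.0], [1.0, -2.0], [1.0, 2.0]])
x = np.array([1.0, 3.0])


def additive(X):
    # No feature interaction: a sum of univariate terms.
    return 3.0 * X[:, 0] + X[:, 1] ** 2


def interacting(X):
    # Pure pairwise interaction.
    return X[:, 0] * X[:, 1]


phi_add = shapley_values(make_value_fn(additive, x, background), 2)
phi_int = shapley_values(make_value_fn(interacting, x, background), 2)

print("additive model:   ", phi_add)
print("interacting model:", phi_int)
```

For the additive model, each Shapley value equals that feature's own centered term, $f_i(x_i) - \mathbb{E}[f_i]$, so the attributions and the function carry the same information. For the product model, the single interaction term is split evenly between the two features, and the per-feature attributions no longer reveal that the effect exists only jointly.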

Figures (22)

  • Figure 1: The fundamental correspondence between SHAP and GAM is used in practice to distinguish two scenarios. In scenario A, such as simpler tabular data, GAM models can achieve SOTA performance and their SHAP explanations align with SHAP explanations of black-box models. In modern ML applications like computer vision, we have scenario B, where there is a gap between GAM and DNN performance in practice. This means that either: (#1) we cannot train GAMs as well as other deep neural networks; or (#2) there is an overcredulous trust of SHAP in these domains.
  • Figure 2: Simple examples (Gaussian input variables and multilinear response variables) which demonstrate each of the two major types of feature interactions: synergistic interactions and redundant interactions. Their full functional ANOVA and exact Shapley functions are additionally calculated and shown. Colored by relevant variable. Note we use $x$ and $y$ instead of $x_1$ and $x_2$.
  • Figure 3: MSE of approximations of the model's SHAP values. Since both FastSHAP and InstaSHAP are functional approximations, we report the MSE across training epochs.
  • Figure 4: The Spectrum of Interpretability to Uninterpretability. We display the key {hour, workday} interaction for the interpretable GAM, explainable SHAP, and uninterpretable black box.
  • Figure 5: 1D and 2D Dependence of Tree Species on Altitude and Soil
  • ...and 17 more figures

Theorems & Definitions (10)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Claim 1
  • proof
  • Theorem 5
  • proof
  • Theorem 6
  • proof