InstaSHAP: Interpretable Additive Models Explain Shapley Values Instantly
James Enouen, Yan Liu
TL;DR
InstaSHAP provides a unifying view of SHAP explanations based on generalized additive models (GAMs), showing that SHAP’s limitations in speed and in representing interactions stem from its alignment with limited functional ANOVA spaces. By casting SHAP and its extensions within a variational GAM framework and introducing an automatic masking/purification objective, the authors obtain purified SHAP values instantly from a single forward pass. The work establishes theoretical correspondences between SHAP, GAMs, and functional ANOVA under correlated inputs, and demonstrates practical benefits in synthetic, real-world tabular, and high-dimensional experiments. The approach offers a principled means of assessing SHAP trustworthiness and highlights the essential role of modeling feature interactions, especially in domains with strong input correlations such as computer vision (CV) and natural language processing (NLP).
Abstract
In recent years, the Shapley value and SHAP explanations have emerged as one of the dominant paradigms for providing post-hoc explanations of black-box models. Despite their well-founded theoretical properties, many recent works have focused on their limitations in both computational efficiency and representation power. The underlying connection with additive models, however, remains critically under-emphasized in the current literature. In this work, we find that a variational perspective linking generalized additive models (GAMs) and SHAP explanations provides deep insights into nearly all recent developments. In light of this connection, we borrow in the other direction to develop a new method for training interpretable GAMs that are automatically purified to compute the Shapley value in a single forward pass. Finally, we provide theoretical results showing that the limited representation power of GAMs is the same Achilles' heel found in SHAP, and we discuss the implications for SHAP's modern usage in CV and NLP.
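
As a small illustration of the link between additive models and Shapley values mentioned above, the following sketch checks a standard fact rather than the InstaSHAP method itself: for a purely additive model, and assuming the marginal (interventional) value function, the exact Shapley value of each feature equals that feature's shape function centered by its mean over the background data. The toy shape functions, the background sample, and all function names are hypothetical and chosen only for this example; the conditional value function under correlated inputs, which the abstract is concerned with, is not covered here.

```python
import itertools
import math

import numpy as np


def shape_functions(x):
    """Hypothetical per-feature shape functions f_i(x_i) of a toy additive model."""
    return np.array([np.sin(x[0]), x[1] ** 2, 0.5 * x[2]])


def model(x):
    """Toy additive model: f(x) = sum_i f_i(x_i), with no interaction terms."""
    return shape_functions(x).sum()


def value(S, x, background):
    """Marginal value function (an assumption of this sketch): features in S
    are fixed to x, the remaining features are averaged over the background."""
    X = background.copy()
    idx = list(S)
    if idx:
        X[:, idx] = x[idx]
    return np.mean([model(row) for row in X])


def exact_shapley(x, background):
    """Exact Shapley values by enumerating all coalitions (exponential in d)."""
    d = len(x)
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for r in range(d):
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for S in itertools.combinations(others, r):
                gain = value(S + (i,), x, background) - value(S, x, background)
                phi[i] += weight * gain
    return phi


rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))  # hypothetical background sample
x = np.array([0.3, -1.2, 2.0])          # hypothetical point to explain

phi = exact_shapley(x, background)

# Centered shape functions: f_i(x_i) minus the background mean of f_i.
background_means = np.mean([shape_functions(row) for row in background], axis=0)
centered = shape_functions(x) - background_means

print(np.allclose(phi, centered))
```

The enumeration costs exponentially many model evaluations in the number of features, while the centered shape functions are read off directly from the additive model. That contrast is the intuition behind computing Shapley values in a single forward pass; once the model contains feature interactions, the two quantities no longer coincide in this simple way.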
