On the Computational Tractability of the (Many) Shapley Values
Reda Marzouk, Shahaf Bassan, Guy Katz, Colin de la Higuera
TL;DR
This work gives a comprehensive complexity-theoretic analysis of SHAP variants beyond Conditional SHAP, across diverse model classes and distributions. It proves that local and global Interventional and Baseline SHAP are computable in polynomial time under Hidden Markov Model (HMM) distributions for Weighted Automata (WA), Decision Trees, regression tree ensembles, and Linear Regression, while showing intractability (NP-hardness, co-NP-hardness, or #P-hardness) for several instances of Conditional SHAP and for SHAP in neural networks and hard-voting ensembles. The authors establish generalized complexity relations among SHAP variants, demonstrating clear gaps between Conditional SHAP and the Interventional/Baseline variants in many settings. They also provide reductions showing how tractability results extend from WA to non-sequential models (DT, ENS-DT_R, LIN_R) and how empirical distributions relate to broader distribution families such as HMMs. Overall, the paper clarifies when SHAP values can be computed exactly and efficiently and when exact computation is intractable, guiding both theory and practice in interpretable AI.
Abstract
Recent studies have examined the computational complexity of computing Shapley additive explanations (also known as SHAP) across various models and distributions, revealing their tractability or intractability in different settings. However, these studies primarily focused on a specific variant called Conditional SHAP, though many other variants exist and address different limitations. In this work, we analyze the complexity of computing a much broader range of such variants, including Conditional, Interventional, and Baseline SHAP, while exploring both local and global computations. We show that both local and global Interventional and Baseline SHAP can be computed in polynomial time for various ML models under Hidden Markov Model distributions, extending popular algorithms such as TreeSHAP beyond empirical distributions. On the downside, we prove intractability results for these variants over a wide range of neural networks and tree ensembles. We believe that our results emphasize the intricate diversity of computing Shapley values, demonstrating how their complexity is substantially shaped by the specific SHAP variant, the model type, and the distribution.
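To make the distinction between the variants concrete, the following is a minimal brute-force sketch of local Baseline SHAP and local Interventional SHAP. It is not the paper's polynomial-time algorithm: it enumerates all feature subsets and so runs in exponential time. It also assumes an empirical background distribution (a finite list of samples) rather than an HMM. The function names `shapley_values`, `baseline_shap`, and `interventional_shap` are illustrative and do not come from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n):
    """Exact Shapley values of a set function `value` over players 0..n-1.

    Enumerates every subset, so this takes exponential time in n.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude player i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def baseline_shap(f, x, baseline):
    """Baseline SHAP: features outside S are replaced by a fixed baseline point."""
    n = len(x)

    def value(s):
        return f([x[j] if j in s else baseline[j] for j in range(n)])

    return shapley_values(value, n)


def interventional_shap(f, x, background):
    """Interventional SHAP under an empirical distribution (assumed here):
    features outside S are drawn from background samples, independently of x_S."""
    n = len(x)

    def value(s):
        total = 0.0
        for b in background:
            total += f([x[j] if j in s else b[j] for j in range(n)])
        return total / len(background)

    return shapley_values(value, n)


if __name__ == "__main__":
    # Hypothetical toy model used only for illustration.
    weights = [2.0, -1.0, 0.5]

    def linear(z):
        return sum(w * v for w, v in zip(weights, z)) + 1.0

    x = [1.0, 2.0, 3.0]
    print(baseline_shap(linear, x, [0.0, 0.0, 0.0]))
    print(interventional_shap(linear, x, [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]))
```

For a linear model, the Baseline SHAP value of feature i reduces to w_i (x_i - b_i), which gives a quick sanity check of the sketch. Conditional SHAP differs in that the features outside S are drawn conditionally on x_S, which requires a model of the joint distribution and is not covered by this sketch.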
