Improving the Weighting Strategy in KernelSHAP
Lars Henry Berge Olsen, Martin Jullum
TL;DR
The paper addresses the high computational cost of conditional Shapley-value explanations computed with KernelSHAP. It introduces deterministic weighting through the paired c-kernel strategy and refines the PySHAP approach (PySHAP*); the two can be combined as PySHAP* c-kernel. It presents a detailed taxonomy of sampling and weighting strategies, derives deterministic weight corrections, and reports improved efficiency (5–50% fewer coalitions, up to ~95% in some regimes) at preserved accuracy on simulated Gaussian data and a real-world Red Wine dataset. Across XGBoost and linear-model scenarios, the paired c-kernel and PySHAP* c-kernel methods consistently outperform existing strategies by reducing variance and stabilizing the weights, especially when many features are present. The work offers practical guidance for scalable Shapley-value explanations of high-dimensional tabular data and suggests extending these ideas to other Shapley-based explanations and to theoretical analyses.
Abstract
In Explainable AI (XAI), Shapley values are a popular model-agnostic framework for explaining predictions made by complex machine learning models. The computation of Shapley values requires estimating non-trivial contribution functions representing predictions with only a subset of the features present. As the number of these terms grows exponentially with the number of features, computational costs escalate rapidly, creating a pressing need for efficient and accurate approximation methods. For tabular data, the KernelSHAP framework is considered the state-of-the-art model-agnostic approximation framework. KernelSHAP approximates the Shapley values using a weighted sample of the contribution functions for different feature subsets. We propose a novel modification of KernelSHAP which replaces the stochastic weights with deterministic ones to reduce the variance of the resulting Shapley value approximations. This may also be combined with our simple, yet effective modification to the KernelSHAP variant implemented in the popular Python library SHAP. Additionally, we provide an overview of established methods. Numerical experiments demonstrate that our methods can reduce the required number of contribution function evaluations by $5\%$ to $50\%$ while preserving the same accuracy of the approximated Shapley values -- essentially reducing the running time by up to $50\%$. These computational advancements push the boundaries of the feature dimensionality and number of predictions that can be accurately explained with Shapley values within a feasible runtime.
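To make the weighted-regression view of KernelSHAP concrete, the following is a minimal sketch of the standard formulation on a toy game, not the authors' paired c-kernel or PySHAP* strategies. It enumerates all $2^M$ coalitions rather than sampling them, which is exactly the cost that becomes infeasible as $M$ grows. The names `shapley_kernel_weight`, `kernelshap_exact`, and `toy_value` are hypothetical, and the large finite weight standing in for the infinite weight on the empty and full coalitions is an assumption made for simplicity.

```python
import itertools
import math

import numpy as np


def shapley_kernel_weight(M, size):
    """Shapley kernel weight for a coalition of the given size among M features."""
    if size == 0 or size == M:
        # Assumption: a large finite weight approximates the infinite weight
        # that enforces the constraints at the empty and full coalitions.
        return 1e6
    return (M - 1) / (math.comb(M, size) * size * (M - size))


def kernelshap_exact(v, M):
    """Weighted least squares over all 2^M coalitions (no sampling).

    v maps a tuple of feature indices to a contribution-function value.
    Returns [phi_0, phi_1, ..., phi_M], where phi_0 is the baseline.
    """
    coalitions = [
        S for r in range(M + 1) for S in itertools.combinations(range(M), r)
    ]
    # Design matrix: intercept column plus one membership indicator per feature.
    Z = np.zeros((len(coalitions), M + 1))
    Z[:, 0] = 1.0
    for i, S in enumerate(coalitions):
        for j in S:
            Z[i, j + 1] = 1.0
    w = np.array([shapley_kernel_weight(M, len(S)) for S in coalitions])
    y = np.array([v(S) for S in coalitions])
    ZtW = Z.T * w  # equivalent to Z.T @ diag(w)
    return np.linalg.solve(ZtW @ Z, ZtW @ y)


def toy_value(S):
    """Hypothetical contribution function: additive terms plus one interaction."""
    main = {0: 1.0, 1: 2.0, 2: 3.0}
    value = 0.5 + sum(main[j] for j in S)
    if 0 in S and 1 in S:
        value += 2.0  # interaction shared between features 0 and 1
    return value


M = 3
phi = kernelshap_exact(toy_value, M)
print("number of coalitions:", 2 ** M)
print("estimated [phi_0, phi_1, phi_2, phi_3]:", np.round(phi, 4))
```

For this toy game the classical Shapley values are 2, 3, and 3 with baseline 0.5, since the interaction term is split equally between the first two features; the regression recovers them up to the small error introduced by the finite stand-in weight. In practice, KernelSHAP evaluates only a sample of the coalitions, and the weights attached to that sample are what the paper's deterministic strategies modify.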
