Approximating the Shapley Value without Marginal Contributions
Patrick Kolpaczki, Viktor Bengs, Maximilian Muschalik, Eyke Hüllermeier
TL;DR
This paper addresses the practical difficulty of computing Shapley values by introducing budget-aware, domain-independent sampling-based approximations that avoid marginal contributions. SVARM uses two tailored sampling distributions to update multiple Shapley estimates per evaluation of the value function $\nu$, with rigorous unbiasedness and variance guarantees; Stratified SVARM further partitions coalitions by size to reduce variance and accelerate convergence. The authors provide theoretical bounds, including $\mathbb{V}[\hat{\phi}_i] \le \frac{2 H_n}{\bar{T}}(\sigma_i^{+2} + \sigma_i^{-2})$ for SVARM and $\mathbb{V}[\hat{\phi}_i] \le \frac{2 \log n}{n \bar{T}} \sum_{\ell} (\sigma_{i,\ell}^{+2} + \sigma_{i,\ell+1}^{-2})$ for Stratified SVARM, along with Chebyshev and Hoeffding tail bounds. Empirically, the SVARM variants, especially Stratified SVARM$^+$, are compared against ApproShapley, KernelSHAP, and other state-of-the-art methods on synthetic games and explainability tasks (NLP, image, and tabular data), and match or exceed them under fixed evaluation budgets. The work demonstrates that dropping the reliance on marginal contributions and reframing Shapley estimation around direct access to coalition values can yield efficient, accurate, anytime approximations with practical impact for explainable AI and beyond.
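The idea of updating several estimates from one coalition value can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function names, the size weights (proportional to $1/s$ and $1/(n-s)$), and the plain running averages are assumptions chosen so that, conditional on a player being inside (or outside) the sampled coalition, the rest of the coalition follows the usual Shapley weighting. Any initialization or stratification steps from the paper are omitted.

```python
import random


def sample_coalition(players, sizes, size_weights, rng):
    """Draw a size with the given weights, then a uniform coalition of that size."""
    size = rng.choices(sizes, weights=size_weights, k=1)[0]
    return frozenset(rng.sample(players, size))


def two_distribution_estimate(value, n, budget, seed=0):
    """Estimate all Shapley values from coalition values only (hypothetical sketch).

    value  : callable mapping a frozenset of player indices to a float
    n      : number of players (indexed 0..n-1)
    budget : total number of calls to `value`
    """
    rng = random.Random(seed)
    players = list(range(n))

    # Running sums and counts for the "with player" and "without player" parts.
    plus_sum, plus_cnt = [0.0] * n, [0] * n
    minus_sum, minus_cnt = [0.0] * n, [0] * n

    # Assumed size distributions: sizes 1..n with weight 1/s for the "plus"
    # samples, sizes 0..n-1 with weight 1/(n-s) for the "minus" samples.
    plus_sizes = list(range(1, n + 1))
    plus_weights = [1.0 / s for s in plus_sizes]
    minus_sizes = list(range(0, n))
    minus_weights = [1.0 / (n - s) for s in minus_sizes]

    for _ in range(budget // 2):
        # One evaluation updates the "plus" estimate of every member.
        a_plus = sample_coalition(players, plus_sizes, plus_weights, rng)
        v = value(a_plus)
        for i in a_plus:
            plus_sum[i] += v
            plus_cnt[i] += 1

        # One evaluation updates the "minus" estimate of every non-member.
        a_minus = sample_coalition(players, minus_sizes, minus_weights, rng)
        v = value(a_minus)
        for i in players:
            if i not in a_minus:
                minus_sum[i] += v
                minus_cnt[i] += 1

    # Each estimate is the difference of two sample means (0.0 if never updated).
    return [
        (plus_sum[i] / plus_cnt[i] if plus_cnt[i] else 0.0)
        - (minus_sum[i] / minus_cnt[i] if minus_cnt[i] else 0.0)
        for i in players
    ]
```

For an additive game such as `value = lambda s: sum(w[i] for i in s)`, the exact Shapley value of player `i` is `w[i]`, which gives a quick sanity check for the sketch.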
Abstract
The Shapley value, which is arguably the most popular approach for assigning a meaningful contribution value to players in a cooperative game, has recently been used intensively in explainable artificial intelligence. Its meaningfulness is due to axiomatic properties that only the Shapley value satisfies, which, however, comes at the price of an exact computation whose cost grows exponentially with the number of agents. Accordingly, a number of works are devoted to the efficient approximation of the Shapley value, most of which revolve around the notion of an agent's marginal contribution. In this paper, we propose SVARM and Stratified SVARM, two parameter-free and domain-independent approximation algorithms based on a representation of the Shapley value detached from the notion of marginal contribution. We prove unmatched theoretical guarantees regarding their approximation quality and provide empirical results, covering synthetic games as well as common explainability use cases, in which we compare our algorithms with state-of-the-art methods.
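To make the phrase "detached from the notion of marginal contribution" concrete, the standard Shapley formula can be split into two separate weighted averages of coalition values. The symbols $N$ (player set), $n = |N|$, $\phi_i^+$, and $\phi_i^-$ are notation assumed here for illustration; the abstract itself does not fix them.

$$
\phi_i
= \sum_{S \subseteq N \setminus \{i\}} \frac{1}{n \binom{n-1}{|S|}} \bigl[\nu(S \cup \{i\}) - \nu(S)\bigr]
= \underbrace{\sum_{S \subseteq N \setminus \{i\}} \frac{\nu(S \cup \{i\})}{n \binom{n-1}{|S|}}}_{\phi_i^+}
- \underbrace{\sum_{S \subseteq N \setminus \{i\}} \frac{\nu(S)}{n \binom{n-1}{|S|}}}_{\phi_i^-}
$$

Because the weights $\frac{1}{n \binom{n-1}{|S|}}$ sum to one over all $S \subseteq N \setminus \{i\}$, each of the two terms is an expectation of a single coalition value, so the two terms can be estimated from different samples rather than from paired evaluations of $\nu(S \cup \{i\})$ and $\nu(S)$.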
