Approximating the Shapley Value without Marginal Contributions

Patrick Kolpaczki; Viktor Bengs; Maximilian Muschalik; Eyke Hüllermeier

TL;DR

This paper addresses the practical difficulty of computing Shapley values by introducing budget-aware, domain-independent probabilistic approximations that avoid marginal contributions. SVARM uses two tailored sampling distributions so that each evaluation of the value function $\nu$ updates multiple Shapley estimates, with proven unbiasedness and variance guarantees; Stratified SVARM further partitions coalitions by size to reduce variance and accelerate convergence. The authors provide theoretical bounds, including $\mathbb{V}[\hat{\phi}_i] \le \frac{2 H_n}{\bar{T}}(\sigma_i^{+2} + \sigma_i^{-2})$ for SVARM and $\mathbb{V}[\hat{\phi}_i] \le \frac{2 \log n}{n \bar{T}} \sum_{\ell} (\sigma_{i,\ell}^{+2} + \sigma_{i,\ell+1}^{-2})$ for Stratified SVARM, along with Chebyshev and Hoeffding tail bounds. Empirically, SVARM variants, especially Stratified SVARM$^+$, outperform ApproShapley and KernelSHAP on synthetic games and explainability tasks (NLP, image, and tabular data), rivaling or exceeding state-of-the-art methods under fixed budgets. The work demonstrates that dropping the reliance on marginal contributions and reframing Shapley estimation around direct access to coalition values can yield efficient, accurate, anytime approximations with practical impact for explainable AI and beyond.

Abstract

The Shapley value, which is arguably the most popular approach for assigning a meaningful contribution value to players in a cooperative game, has recently been used intensively in explainable artificial intelligence. Its meaningfulness is due to axiomatic properties that only the Shapley value satisfies, which, however, comes at the expense of an exact computation growing exponentially with the number of agents. Accordingly, a number of works are devoted to the efficient approximation of the Shapley value, most of which revolve around the notion of an agent's marginal contribution. In this paper, we propose SVARM and Stratified SVARM, two parameter-free and domain-independent approximation algorithms based on a representation of the Shapley value detached from the notion of marginal contribution. We prove unmatched theoretical guarantees regarding their approximation quality and provide empirical results on synthetic games as well as common explainability use cases, comparing our algorithms with state-of-the-art methods.
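To make the idea of a representation "detached from the notion of marginal contribution" concrete, here is a minimal sketch. It relies only on the standard Shapley weights $1/(n \binom{n-1}{|S|})$ and on linearity: the weighted average of $\nu(S \cup \{i\}) - \nu(S)$ equals the weighted average of $\nu(S \cup \{i\})$ minus the weighted average of $\nu(S)$. The function names and the quadratic toy game are assumptions made for illustration and do not come from the paper; the code enumerates all coalitions, so it is only feasible for a handful of players.

```python
from itertools import combinations
from math import comb


def shapley_marginal(value, n):
    """Textbook Shapley value: weighted average of marginal contributions."""
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = 1.0 / (n * comb(n - 1, size))
            for members in combinations(others, size):
                S = frozenset(members)
                total += weight * (value(S | {i}) - value(S))
        phi.append(total)
    return phi


def shapley_plus_minus(value, n):
    """Same weights, but coalition values with and without i are averaged separately."""
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        plus = 0.0   # weighted mean worth of coalitions that contain i
        minus = 0.0  # weighted mean worth of coalitions that do not contain i
        for size in range(n):
            weight = 1.0 / (n * comb(n - 1, size))
            for members in combinations(others, size):
                S = frozenset(members)
                plus += weight * value(S | {i})
                minus += weight * value(S)
        phi.append(plus - minus)
    return phi


# Hypothetical toy game (not from the paper): worth = (sum of member weights)^2.
WEIGHTS = [1.0, 2.0, 3.0, 4.0]


def toy_game(S):
    return sum(WEIGHTS[j] for j in S) ** 2


phi_marginal = shapley_marginal(toy_game, len(WEIGHTS))
phi_split = shapley_plus_minus(toy_game, len(WEIGHTS))
print(phi_marginal)
print(phi_split)
```

Both functions should agree up to floating-point error. The second form matters for sampling: a single coalition value can inform the "plus" average of every player inside the coalition and the "minus" average of every player outside it, whereas a marginal contribution needs two evaluations and serves one player.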

Paper Structure (41 sections, 28 theorems, 95 equations, 9 figures, 1 table, 8 algorithms)

Introduction
Contribution.
Related Work
Problem Statement
SVARM
Theoretical analysis.
Stratified SVARM
Exact calculation.
Refined warm-up.
Enhanced update rule.
Theoretical analysis.
Empirical Results
Synthetic games
Explainability games
Conclusion
...and 26 more sections

Key Result

Theorem 1

The Shapley value estimate $\hat{\phi}_i$ of any $i \in \mathcal{N}$ obtained by SVARM is unbiased, i.e., $\mathbb{E}[\hat{\phi}_i] = \phi_i$.

Figures (9)

Figure 1: Illustration of SVARM's sampling process and update rule: Each player $i$ has two urns $U_i^+ := \{S \cup \{i\} \mid S \subseteq \mathcal{N}_i\}$ and $U_i^- := \{S \mid S \subseteq \mathcal{N}_i\}$ containing marbles which represent coalitions, with mean coalition worth $\phi_i^+$ and $\phi_i^-$. SVARM alternates between sampling coalitions $A^+ \sim P^+$ and $A^- \sim P^-$. With each drawn coalition all estimates of those urns are updated which contain the corresponding marble. Since each player's two urns form a partition of the powerset $\mathcal{P}(\mathcal{N})$, all players have exactly one urn updated with each sample.
Figure 2: Illustration of Stratified SVARM's sampling process and update rule: Each player $i$ has urns $U_{i,\ell}^+ := \{S \cup \{i\} \mid S \subseteq \mathcal{N}_i, |S|=\ell\}$ and $U_{i,\ell}^- := \{S \mid S \subseteq \mathcal{N}_i, |S| = \ell \}$ for all $\ell \in \{0,\ldots,n-1\}$, $2n$ in total, containing marbles which represent coalitions, with mean coalition worth $\phi_{i,\ell}^+$ and $\phi_{i,\ell}^-$. Stratified SVARM samples in each time step $t$ a coalition $A_t \subseteq \mathcal{N}$ and updates the estimates of all players' urns that contain the corresponding marble. Since each player's urns form a partition of the powerset $\mathcal{P}(\mathcal{N})$, all players have exactly one urn updated with each sample.
Figure 3: Averaged MSE and standard errors over 100 repetitions as a function of the fixed budget $T$: (1) Airport game, (2) Shoe game, (3) SOUG game, (4) NLP sentiment analysis, (5) Image classifier, (6) Adult classification.
Figure 4: Airport game with 100 players: Averaged MSE over 100 repetitions as a function of the fixed budget $T$, shaded bands showing standard errors.
Figure 5: SOUG game with 20 players: Averaged MSE over 100 repetitions as a function of the fixed budget $T$, shaded bands showing standard errors.
...and 4 more figures
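
The urn picture in the caption of Figure 2 can be sketched in code. The snippet below is a simplified illustration and not the paper's Stratified SVARM: it assumes coalition sizes are drawn uniformly from $\{0,\ldots,n\}$ (the paper's sampling distribution is not reproduced here) and omits the exact calculation, warm-up, and enhanced update rule listed in the outline. It keeps only the core idea: one evaluation of the sampled coalition $A_t$ updates exactly one urn per player, and the final estimate averages $\hat{\phi}_{i,\ell}^+ - \hat{\phi}_{i,\ell}^-$ over the $n$ sizes $\ell$. The function name, the toy game, and the seed handling are hypothetical.

```python
import random


def stratified_urn_estimate(value, n, budget, seed=0):
    """Estimate Shapley values from size-stratified running means (simplified sketch)."""
    rng = random.Random(seed)
    players = list(range(n))
    # plus_mean[i][l] estimates the mean worth of U_{i,l}^+, minus_mean[i][l] that of U_{i,l}^-.
    plus_mean = [[0.0] * n for _ in players]
    minus_mean = [[0.0] * n for _ in players]
    plus_count = [[0] * n for _ in players]
    minus_count = [[0] * n for _ in players]

    for _ in range(budget):
        # ASSUMPTION: uniform coalition size, then a uniform coalition of that size.
        size = rng.randint(0, n)
        coalition = frozenset(rng.sample(players, size))
        worth = value(coalition)  # the only value-function call in this step

        for i in players:
            if i in coalition:
                # The coalition is S ∪ {i} with |S| = size - 1, so it lies in U_{i,size-1}^+.
                level = size - 1
                plus_count[i][level] += 1
                plus_mean[i][level] += (worth - plus_mean[i][level]) / plus_count[i][level]
            else:
                # The coalition is S with |S| = size, so it lies in U_{i,size}^-.
                level = size
                minus_count[i][level] += 1
                minus_mean[i][level] += (worth - minus_mean[i][level]) / minus_count[i][level]

    # Urns that were never hit keep the estimate 0.0, which would bias a tiny budget.
    return [
        sum(plus_mean[i][l] - minus_mean[i][l] for l in range(n)) / n
        for i in players
    ]


# Hypothetical toy game (not from the paper): worth = (sum of member weights)^2.
# Its exact Shapley values are WEIGHTS[i] * sum(WEIGHTS), i.e. 10, 20, 30, 40.
WEIGHTS = [1.0, 2.0, 3.0, 4.0]


def toy_game(S):
    return sum(WEIGHTS[j] for j in S) ** 2


estimates = stratified_urn_estimate(toy_game, len(WEIGHTS), budget=20000)
print(estimates)
```

With a budget of 20000 evaluations on four players, the estimates should land near the exact values, although the precise numbers depend on the seed. Because every sampled coalition feeds one urn of every player, no evaluation is spent on a single player only.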

Theorems & Definitions (53)

Theorem 1
Theorem 2
Corollary 1
Theorem 3
Theorem 4
Theorem 5
Theorem 6
Corollary 2
Theorem 7
Theorem 8
...and 43 more
