Shapley Value Approximation Based on k-Additive Games

Guilherme Dean Pelegrina, Patrick Kolpaczki, Eyke Hüllermeier

TL;DR

The paper tackles the computational bottleneck of Shapley value estimation in high-dimensional explainability tasks. It introduces SVA$k_{\text{ADD}}$, which fits a $k$-additive surrogate game to sampled coalitions so that the surrogate's exact Shapley values serve as estimates for the original game. The authors prove that, with weights $w_A^* = \binom{n-2}{|A|-1}^{-1}$, the approach recovers the true Shapley values for $k=1,2,3$ when all coalitions are observed. They empirically compare it against competing baselines across diverse datasets and explanation types, reporting robust performance and a favorable trade-off between expressiveness and sample efficiency. The method is domain-, task-, and model-agnostic, and offers a practical tool for efficient, interpretable attribution in Explainable AI.

Abstract

The Shapley value is the prevalent solution for fair division problems in which a payout is to be divided among multiple agents. By adopting a game-theoretic view, the idea of fair division and the Shapley value can also be used in machine learning to quantify the individual contribution of features or data points to the performance of a predictive model. Despite its popularity and axiomatic justification, the Shapley value suffers from a computational complexity that scales exponentially with the number of entities involved, and hence requires approximation methods for its reliable estimation. We propose SVA$k_{\text{ADD}}$, a novel approximation method that fits a $k$-additive surrogate game. By taking advantage of $k$-additivity, we are able to elicit the exact Shapley values of the surrogate game and then use these values as estimates for the original fair division problem. The efficacy of our method is evaluated empirically and compared to competing methods.
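
To make the exponential cost mentioned in the abstract concrete, the following is a minimal brute-force sketch of the exact Shapley value. The function name `shapley_values` and the toy game are illustrative assumptions, not part of the paper; the weights are the standard Shapley coefficients $\frac{|S|!\,(n-|S|-1)!}{n!}$.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, value):
    """Exact Shapley values of an n-player game.

    `value` maps a frozenset of player indices to a number. Each player's
    marginal contribution is averaged over all 2^(n-1) coalitions of the
    other players, so the total cost grows exponentially in n.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley coefficient for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy game: a coalition is worth the square of its summed skills.
skills = [1.0, 2.0, 3.0]


def toy_game(coalition):
    return sum(skills[i] for i in coalition) ** 2


phi = shapley_values(3, toy_game)
print(phi)  # approximately [6.0, 12.0, 18.0], which sums to toy_game(N) = 36
```

For this toy game each player receives its skill times the total skill, and the values add up to the worth of the grand coalition. Already at a few dozen players the nested loops become infeasible, which is the motivation for approximation.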

Paper Structure

This paper contains 26 sections, 2 theorems, 40 equations, 4 figures, 1 algorithm.

Key Result

Theorem 4.2

The solution to the $k$-additive optimization problem of any cooperative game $(N,\nu)$ for the cases of $k=1$, $k=2$, and $k=3$ with weights $w_A^* = \binom{n-2}{|A|-1}^{-1}$ yields the Shapley value.
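
The statement can be checked numerically on a small game. The sketch below is an illustration under stated assumptions, not the paper's implementation: the surrogate is parametrized by Möbius coefficients $m(B)$ for $1 \le |B| \le k$, the fit is a weighted least-squares problem over all proper non-empty coalitions, and efficiency $\nu_k(N) = \nu(N)$ is imposed as a hard constraint. The function names and the random test game are hypothetical.

```python
from itertools import combinations
from math import comb, factorial

import numpy as np


def shapley_brute_force(n, value):
    """Reference Shapley values by full enumeration."""
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            w = factorial(size) * factorial(n - size - 1) / factorial(n)
            for sub in combinations(others, size):
                s = frozenset(sub)
                phi[i] += w * (value(s | {i}) - value(s))
    return phi


def fit_k_additive(n, k, value):
    """Shapley values of a k-additive surrogate fitted on all proper coalitions.

    Assumptions of this sketch: Moebius parametrization, weights
    1 / C(n-2, |A|-1), and a hard efficiency constraint (requires k <= n-1).
    """
    basis = [frozenset(b) for r in range(1, k + 1)
             for b in combinations(range(n), r)]
    coalitions = [frozenset(a) for r in range(1, n)
                  for a in combinations(range(n), r)]
    # X[A, B] = 1 if B is a subset of A, so (X @ m)[A] equals nu_k(A).
    X = np.array([[1.0 if b <= a else 0.0 for b in basis] for a in coalitions])
    y = np.array([value(a) for a in coalitions])
    w = np.array([1.0 / comb(n - 2, len(a) - 1) for a in coalitions])
    # KKT system of: min sum_A w_A (y_A - (X m)_A)^2  s.t.  sum_B m(B) = nu(N).
    d = len(basis)
    kkt = np.zeros((d + 1, d + 1))
    kkt[:d, :d] = 2.0 * X.T @ (w[:, None] * X)
    kkt[:d, d] = 1.0
    kkt[d, :d] = 1.0
    rhs = np.append(2.0 * X.T @ (w * y), value(frozenset(range(n))))
    m = np.linalg.solve(kkt, rhs)[:d]
    # Shapley value of a k-additive game: m(B) is split equally among B.
    phi = np.zeros(n)
    for b, coeff in zip(basis, m):
        for i in b:
            phi[i] += coeff / len(b)
    return phi


# Hypothetical random game on 5 players with nu(empty set) = 0.
n = 5
rng = np.random.default_rng(0)
table = {frozenset(a): rng.normal()
         for r in range(1, n + 1) for a in combinations(range(n), r)}
table[frozenset()] = 0.0
value = table.__getitem__

reference = shapley_brute_force(n, value)
for k in (1, 2, 3):
    gap = np.max(np.abs(fit_k_additive(n, k, value) - reference))
    print(f"k={k}: max deviation from exact Shapley values = {gap:.2e}")
```

With every coalition observed, the deviations should sit at the level of floating-point error for all three degrees. Replacing the full coalition list with a sampled subset turns the same least-squares fit into an approximation.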

Figures (4)

  • Figure 1: The sampled coalition values $\nu(A_1),\ldots,\nu(A_T)$ from the given game $(N,\nu)$ are used to fit a $k$-additive surrogate game $(N,\nu_k)$ in polynomial time. The Shapley values $\phi_1^k,\ldots,\phi_n^k$ of $(N,\nu_k)$ are obtained immediately from its $k$-additive representation. Since $\nu_k$ approximates $\nu$, these serve as estimates of the true Shapley values $\phi_1,\ldots,\phi_n$ of $(N,\nu)$.
  • Figure 2: MSE of SVA$k_{\text{ADD}}$, averaged over 100 repetitions, as a function of the available budget $T$ for different additivity degrees $k$. Datasets stem from various explanation types: global (a)-(c), local (d)-(f), and unsupervised (g)-(i), with differing player numbers $n$.
  • Figure 3: MSE of SVA$k_{\text{ADD}}$ and competing methods, averaged over 100 repetitions, as a function of the available budget $T$. Datasets stem from various explanation types: global (a)-(c), local (d)-(f), and unsupervised (g)-(i), with differing player numbers $n$.
  • Figure 4: MSE of SVA$k_{\text{ADD}}$ and competing methods, averaged over 100 repetitions, as a function of the available sample budget $T$. Datasets stem from various explanation types: (i) global (first row), (ii) local (second row), and (iii) unsupervised (third row), with differing player numbers $n$.
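
The readout step described in the caption of Figure 1 can be illustrated with a small example. The sketch assumes the surrogate is stored as Möbius coefficients, which is one common way to represent a $k$-additive game; the paper may use a different but equivalent representation. The coefficients below are made up for illustration.

```python
# Hypothetical 2-additive game on 3 players, given by its Moebius coefficients.
moebius = {
    frozenset({0}): 1.0,
    frozenset({1}): 2.0,
    frozenset({2}): 0.5,
    frozenset({0, 1}): 3.0,
    frozenset({1, 2}): -1.0,
}


def surrogate_value(coalition):
    """nu_k(A): sum of the coefficients of all stored sets contained in A."""
    return sum(m for b, m in moebius.items() if b <= coalition)


def surrogate_shapley(n):
    """Each coefficient m(B) is shared equally among the members of B."""
    phi = [0.0] * n
    for b, m in moebius.items():
        for i in b:
            phi[i] += m / len(b)
    return phi


phi = surrogate_shapley(3)
print(phi)  # [2.5, 3.0, 0.0]
print(surrogate_value(frozenset({0, 1, 2})))  # 5.5, the sum of the values above
```

The readout touches each stored coefficient once, so its cost is governed by the number of sets of size at most $k$ rather than by all $2^n$ coalitions.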

Theorems & Definitions (3)

  • Definition 4.1
  • Theorem 4.2
  • Lemma 1.1