SVARM-IQ: Efficient Approximation of Any-order Shapley Interactions through Stratification

Patrick Kolpaczki, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüllermeier

TL;DR

SVARM-IQ addresses the computational intractability of high-order Shapley-based interaction indices with a stratified, budget-aware sampling method that estimates cardinal interaction indices (CIIs) of any order. It rewrites each interaction as a combination of strata, so that a single coalition evaluation updates multiple strata and thereby all interaction estimates at once; the resulting estimates are provably unbiased and come with non-asymptotic error bounds. Experiments on language and vision tasks complement the theory: SVARM-IQ outperforms the SHAP-IQ and permutation sampling baselines in MSE and Prec@10 under realistic evaluation budgets. The method is model-agnostic and jointly quantifies feature importance and high-order interactions, which matters most in domains with strongly correlated features and complex interactions.

Abstract

Addressing the limitations of individual attribution scores via the Shapley value (SV), the field of explainable AI (XAI) has recently explored intricate interactions of features or data points. In particular, extensions of the SV, such as the Shapley Interaction Index (SII), have been proposed as a measure to still benefit from the axiomatic basis of the SV. However, similar to the SV, their exact computation remains computationally prohibitive. Hence, we propose with SVARM-IQ a sampling-based approach to efficiently approximate Shapley-based interaction indices of any order. SVARM-IQ can be applied to a broad class of interaction indices, including the SII, by leveraging a novel stratified representation. We provide non-asymptotic theoretical guarantees on its approximation quality and empirically demonstrate that SVARM-IQ achieves state-of-the-art estimation results in practical XAI scenarios on different model classes and application domains.
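The stratified idea can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class `StratifiedSII`, the helpers `powerset` and `exact_sii`, and the toy `unanimity` game are hypothetical names introduced here, the index is fixed to the SII, and the sampling distribution over coalitions and any special handling of sparsely populated strata are omitted. Each stratum is keyed by an interaction $K$, the part $W \subseteq K$ contained in the evaluated coalition, and the number $\ell$ of remaining players, so one evaluation feeds exactly one stratum per interaction.

```python
from itertools import combinations
from math import factorial


def powerset(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def exact_sii(nu, n, K):
    """Brute-force SII of K via discrete derivatives (touches all 2^n coalitions)."""
    K = frozenset(K)
    k = len(K)
    rest = [i for i in range(n) if i not in K]
    total = 0.0
    for S in powerset(rest):
        s = len(S)
        weight = factorial(n - s - k) * factorial(s) / factorial(n - k + 1)
        # Discrete derivative of K at S: signed sum over all W subset of K.
        delta = sum((-1) ** (k - len(W)) * nu(S | W) for W in powerset(K))
        total += weight * delta
    return total


class StratifiedSII:
    """Hypothetical helper: running stratum means keyed by (K, W, l)."""

    def __init__(self, n, k):
        self.n, self.k = n, k
        self.interactions = [frozenset(c) for c in combinations(range(n), k)]
        self.sums, self.counts = {}, {}

    def update(self, A, value):
        """Reuse one evaluation nu(A) = value for every interaction K of size k."""
        A = frozenset(A)
        for K in self.interactions:
            W = A & K                      # part of K inside the coalition
            key = (K, W, len(A) - len(W))  # l = number of players outside K
            self.sums[key] = self.sums.get(key, 0.0) + value
            self.counts[key] = self.counts.get(key, 0) + 1

    def estimate(self, K):
        """Combine stratum means; unseen strata count as 0 (a simplification)."""
        K = frozenset(K)
        total = 0.0
        for W in powerset(K):
            sign = (-1) ** (self.k - len(W))
            for l in range(self.n - self.k + 1):
                key = (K, W, l)
                if key in self.counts:
                    total += sign * self.sums[key] / self.counts[key]
        # For the SII, binom(n-k, l) * weight(l) equals 1 / (n - k + 1) for every l.
        return total / (self.n - self.k + 1)


def unanimity(S):
    """Toy game (assumption): worth 1 only if players 0 and 1 are both present."""
    return 1.0 if {0, 1} <= S else 0.0


n, k = 4, 2
estimator = StratifiedSII(n, k)
for A in powerset(range(n)):  # full enumeration, only to check the identity
    estimator.update(A, unanimity(A))
print(estimator.estimate((0, 1)), exact_sii(unanimity, n, (0, 1)))
```

When every coalition is fed in once, each stratum mean is exact and the combination reproduces the brute-force SII. Under a limited budget, `update` would instead be called on sampled coalitions, and the quality of the estimate depends on how well each stratum is covered.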

Paper Structure (54 sections, 10 theorems, 78 equations, 10 figures, 2 tables, 5 algorithms)

Key Result

Theorem 4.1

SVARM-IQ's CII estimates are unbiased for all $K \in \mathcal{N}_k$, i.e., $\mathbb{E} [\hat{I}_K] = I_K$.
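For context, the quantities in this statement can be sketched as follows, assuming the standard definitions behind Definitions 2.2 and 2.3. The symbols $\mathcal{N}$ (player set, with $n = |\mathcal{N}|$), $\nu$ (value function), and $\lambda_{k,\ell}$ (index-specific weights) are notation assumed here rather than quoted from this page; $\hat{I}_K$ denotes the estimate of $I_K$ returned by the sampler.

$$
\begin{aligned}
\mathcal{N}_k &= \{ K \subseteq \mathcal{N} : |K| = k \}, \\
\Delta_K(S) &= \sum_{W \subseteq K} (-1)^{|K| - |W|} \, \nu(S \cup W), \\
I_K &= \sum_{S \subseteq \mathcal{N} \setminus K} \lambda_{k,|S|} \, \Delta_K(S), \\
\lambda^{\mathrm{SII}}_{k,\ell} &= \frac{(n - \ell - k)! \, \ell!}{(n - k + 1)!}.
\end{aligned}
$$

With the SII weights in the last line, $I_K$ is the Shapley Interaction Index of $K$, and for $k = 1$ it reduces to the Shapley value of the single player in $K$.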

Figures (10)

  • Figure 1: By dividing an ImageNet picture into multiple patches, attribution scores for single patches and interaction scores for pairs of patches help explain a vision transformer.
  • Figure 2: Schematic overview of SVARM-IQ.
  • Figure 3: Approximation quality of SVARM-IQ (blue) compared to SHAP-IQ (pink) and permutation sampling (purple) baselines for estimating order $k=2,3$ SII on the LM (a; $n=14$) and the ViT (b; $n=16$). Shaded bands represent the standard error over 50 and 30 runs, respectively.
  • Figure 4: Comparison of SVARM-IQ and baselines for STI (left) and FSI (right) on the CNN. Shaded bands represent the standard error over 50 runs.
  • Figure 5: Comparison of ground-truth n-SII values of order $k=1$ and $k=2$ for the predicted class probability of a ViT for an ImageNet picture sliced into a grid of $n=16$ patches (left) against n-SII values estimated by SVARM-IQ (center) and permutation sampling (right). The exact computation requires 65,536 model evaluations, while the budget of both approximators is limited to 5,000 evaluations, only 7.6% of the coalition space.
  • ...and 5 more figures
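
The budget share quoted for Figure 5 can be checked with a few lines of arithmetic. This is a small sketch that assumes each of the $n=16$ patches is either present or absent, so the coalition space has $2^{16}$ elements; the variable names are illustrative.

```python
n_patches = 16
budget = 5000

total_coalitions = 2 ** n_patches   # every patch is either in or out
share = budget / total_coalitions   # fraction of the space the budget covers

print(total_coalitions, f"{share:.1%}")
```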

Theorems & Definitions (24)

  • Definition 2.1: Shapley Value (Shapley, 1953)
  • Definition 2.2: Discrete Derivative (Fujimoto et al., 2006)
  • Definition 2.3: Shapley Interaction Index (Grabisch and Roubens, 1999)
  • Theorem 4.1
  • Theorem 4.2
  • Corollary 4.3
  • Corollary 4.4
  • Lemma E.1
  • proof
  • proof
  • ...and 14 more