SHAP values via sparse Fourier representation

Ali Gorji; Andisheh Amrollahi; Andreas Krause

TL;DR

The paper tackles the prohibitive cost of computing SHAP values for high-dimensional predictors by exploiting sparse Fourier representations. It introduces FourierShap, a two-stage framework that first fits a sparse Fourier spectrum to the predictor (exact for tree ensembles, approximate for black-box models) and then computes SHAP values with a closed-form expression for each Fourier basis function, reducing the problem to a linear, highly parallelizable summation with complexity $\Theta(n|\mathcal{D}|k)$. The key theoretical contribution is the exact SHAP formula for individual Fourier basis functions, which lets the cost of the Fourier fit be amortized across many explanations and yields orders-of-magnitude speedups over KernelShap and related methods, with accuracy controlled by the sparsity $k$. Empirically, FourierShap achieves near-ground-truth fidelity (e.g., $R^2$ approaching 0.99 against TreeSHAP) while delivering substantial speed improvements across diverse datasets and model families, including black-box neural nets and white-box tree ensembles. This work provides a scalable, parallelizable approach to model explanations that preserves SHAP's theoretical guarantees under sparse Fourier representations, broadening practical applicability to large-scale settings.

Abstract

SHAP (SHapley Additive exPlanations) values are a widely used method for local feature attribution in interpretable and explainable AI. We propose an efficient two-stage algorithm for computing SHAP values both in the black-box setting and for tree-based models. Motivated by spectral bias in real-world predictors, we first approximate models using compact Fourier representations, exactly for trees and approximately for black-box models. In the second stage, we introduce a closed-form formula for exactly computing SHAP values from the Fourier representation, which "linearizes" the computation into a simple summation and is amenable to parallelization. As the Fourier approximation is computed only once, our method enables amortized SHAP value computation, achieving significant speedups over existing methods and a tunable trade-off between efficiency and precision.
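The second stage can be illustrated with a small sketch. The paper's closed-form expression is not reproduced on this page, so the formula below is derived here under stated assumptions: the basis functions are parities $(-1)^{\langle f, x\rangle}$ on $\{0,1\}^n$, and the value of a coalition $S$ is the average of $h$ over background points whose coordinates in $S$ are overwritten by the query (the interventional convention). Under those assumptions, for a single basis function and a single background point $b$, every coordinate in $A = \{j : f_j = 1,\ x^*_j \neq b_j\}$ receives $-2\,(-1)^{\langle f, b\rangle}\,(|A| \bmod 2)/|A|$ and all other coordinates receive zero. The names `fourier_shap` and `brute_force_shap`, and the toy spectrum, are hypothetical and chosen for illustration only.

```python
import itertools
import math

def basis(f, x):
    """Parity basis function on the Boolean cube: (-1)^<f, x>."""
    return -1.0 if sum(fi & xi for fi, xi in zip(f, x)) % 2 else 1.0

def fourier_shap(spectrum, query, background):
    """SHAP values of h(x) = sum_f coef[f] * basis(f, x).

    Assumes the interventional value function averaged over `background`.
    Cost is O(n * |background| * k) for a spectrum with k frequencies.
    """
    n = len(query)
    phi = [0.0] * n
    for f, coef in spectrum.items():
        for b in background:
            # Coordinates in supp(f) where the query differs from b.
            A = [j for j in range(n) if f[j] and query[j] != b[j]]
            if len(A) % 2 == 0:
                continue  # even |A| (including empty A) contributes nothing
            share = -2.0 * coef * basis(f, b) / len(A)
            for j in A:
                phi[j] += share / len(background)
    return phi

def brute_force_shap(h, query, background):
    """Exponential-time Shapley formula, used only as a reference check."""
    n = len(query)

    def value(S):
        # Average of h with coordinates in S taken from the query.
        total = 0.0
        for b in background:
            total += h(tuple(query[j] if j in S else b[j] for j in range(n)))
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            w = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            for S in itertools.combinations(others, r):
                S = set(S)
                phi[i] += w * (value(S | {i}) - value(S))
    return phi

# Toy 4-sparse spectrum on {0,1}^4 (illustrative values, not from the paper).
spectrum = {(0, 0, 0, 0): 0.5, (1, 0, 0, 0): -1.0,
            (1, 1, 0, 0): 0.75, (0, 1, 1, 1): 2.0}
query = (1, 0, 1, 1)
background = [(0, 0, 0, 0), (0, 1, 1, 0), (1, 1, 0, 1)]

def h(x):
    return sum(c * basis(f, x) for f, c in spectrum.items())

fast = fourier_shap(spectrum, query, background)
slow = brute_force_shap(h, query, background)
```

The two loops over frequencies and background points are independent of one another, which is what makes the summation easy to parallelize; the brute-force routine is included only so the sketch can be checked on a small cube.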

Paper Structure (35 sections, 5 theorems, 20 equations, 5 figures, 4 tables)

Introduction
Background
Shapley values
Fourier representations
Many real-world black-box predictors have sparse Fourier transforms
Spectral bias of fully connected neural networks
Sparsity of ensembles of decision trees
Computing SHAP values with Fourier representation of functions
Experiments
Relevant work
Neural network theory and simplicity biases
Sparse and low-degree Fourier transform algorithms
Shapley values in the context of Machine learning
Proofs
Proof of propositions
...and 20 more sections

Key Result

Proposition 1

Assume $h:\{0, 1\}^n \rightarrow \mathbb{R}$ can be decomposed as follows: $h(x) = \sum\limits_{i=1}^{p} h_i(x_{S_i}), S_i \subseteq [n]$. That is, each function $h_i: \{0,1\}^{|S_i|} \rightarrow \mathbb{R}$ depends on at most $|S_i|$ variables. Then, $h$ is $k=O( \sum\limits_{i=1}^p 2^{|S_i|})$-spa
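The sparsity claim in Proposition 1 can be checked numerically on a small cube. The sketch below is illustrative only: the function `h`, the subsets `S1` and `S2`, and the normalization $\hat{h}(f) = 2^{-n}\sum_x h(x)(-1)^{\langle f, x\rangle}$ are assumptions made here, and the paper may use a different normalization. It computes every Fourier coefficient of a sum of two functions that each depend on two variables, and collects the nonzero ones.

```python
import itertools

n = 4
S1, S2 = (0, 1), (2, 3)  # hypothetical variable subsets

def h(x):
    h1 = 3.0 * x[0] * x[1] + x[0]    # depends only on the variables in S1
    h2 = 2.0 * (x[2] ^ x[3]) - x[3]  # depends only on the variables in S2
    return h1 + h2

cube = list(itertools.product((0, 1), repeat=n))

def parity(f, x):
    """Parity basis function (-1)^<f, x>."""
    return -1 if sum(fi & xi for fi, xi in zip(f, x)) % 2 else 1

def coefficient(f):
    """Fourier coefficient of h at frequency f (normalization assumed above)."""
    return sum(h(x) * parity(f, x) for x in cube) / 2 ** n

spectrum = {f: coefficient(f) for f in cube}
support = {f: c for f, c in spectrum.items() if abs(c) > 1e-12}

for f, c in sorted(support.items()):
    print(f, c)
```

Every frequency in `support` is nonzero only on coordinates inside a single $S_i$, so the number of nonzero coefficients is at most $2^{|S_1|} + 2^{|S_2|} = 8$ out of $2^4 = 16$.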

Figures (5)

Figure 1: Step 1 of FourierShap: Accuracy of the Fourier transform (in approximating the black-box function) vs. runtime of the sparse Fourier algorithm. Accuracy is measured by the $R^2$ score between the outputs of the black box and those of its Fourier representation on a dataset drawn uniformly at random from the Boolean cube $\{0,1\}^n$. For a fixed level of accuracy, deeper trees require more Fourier coefficients $k$ and therefore a longer runtime. For trees, the approximation eventually becomes perfect because the underlying function is truly sparse.
Figure 2: Speedup vs. Accuracy. The speedup of each algorithm is reported as a multiple of the runtime of KernelShap. Accuracy is quantified by the $R^2$-score against ground truth SHAP values. DeepLift is a white-box algorithm for neural networks. LinRegShap is a black-box algorithm and a variance-reduced version of KernelShap. FastShap, a black-box algorithm, trains an MLP to output SHAP values for a given input in one forward pass. We test MLPs of three different sizes for FastShap. FourierShap is ours. We are 10-10000x faster than LinRegShap on all dataset/model variations. More notably, we outperform DeepLift in the neural network model setting even though we assume only query access (black-box setting). We achieve higher accuracy than FastShap in 3/4 settings, while enabling fine-grained control over the speed-accuracy trade-off.
Figure 3: Speedup vs. tree depth for different algorithms, reported as a multiple of the runtime of TreeSHAP. FourierSHAP is ours. The other baselines are a GPU implementation of TreeSHAP, FastTreeSHAP, and PLTreeSHAP. We achieve order-of-magnitude speedups at all depths on the Entacmaea and SGEMM datasets. We also achieve significant speedups on the other two datasets; however, the advantage diminishes as the maximum depth increases.
Figure 4: Model accuracy of tree models evaluated on the test set for different depths
Figure 5: From left to right: (1) SHAP values generated by KernelSHAP, (2) SHAP values generated by FourierSHAP (ours), and (3) the ground truth SHAP values computed by the original exponential SHAP formula (the equation labeled eqn:shap in the paper) on the Entacmaea dataset. In (3), computing the ground truth SHAP values with the exponential formula is possible because the dataset contains all $2^{13}$ possible Boolean feature vectors. The query points and the background dataset points are chosen at random, 10 of each. This figure shows that both KernelSHAP and our method recover the exact ground truth SHAP values.

Theorems & Definitions (9)

Proposition 1
Proposition 2
Lemma 1
Proposition 2
proof
Proposition 2
proof
proof
proof
