Shaping Up SHAP: Enhancing Stability through Layer-Wise Neighbor Selection

Gwladys Kelodjou; Laurence Rozé; Véronique Masson; Luis Galárraga; Romaric Gaudel; Maurice Tchuente; Alexandre Termier

TL;DR

The paper traces the instability of Kernel SHAP to its stochastic neighbor selection and proposes ST-SHAP, which selects neighbors layer by layer, either filling complete layers deterministically or restricting attributions to Layer 1. The authors derive a closed-form Layer-1 attribution formula, show it satisfies LES properties, and empirically validate high stability without fidelity loss, plus substantial speedups relative to exact SHAP. The approach yields a robust, scalable alternative for local explanations, with Layer-1 attributions offering near-SHAP fidelity at linear-time complexity $O(M)$. Across multiple datasets and models, ST-SHAP produces stable explanations (often a Jaccard coefficient of $1.0$ on complete layers) that agree strongly with exact SHAP values, suggesting practical impact for trustworthy, efficient model interpretability.
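To make the stability claim concrete, here is a minimal sketch of how a Jaccard coefficient over repeated explanations could be computed. It assumes an explanation is summarized by its top-k features ranked by absolute attribution; the function names and the pairwise averaging are illustrative assumptions, not necessarily the paper's exact protocol.

```python
from itertools import combinations


def top_k_features(attributions, k):
    """Indices of the k features with the largest absolute attribution."""
    ranked = sorted(range(len(attributions)),
                    key=lambda j: abs(attributions[j]),
                    reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(runs, k):
    """Average Jaccard coefficient over all pairs of runs (needs >= 2 runs)."""
    feature_sets = [top_k_features(r, k) for r in runs]
    pairs = list(combinations(feature_sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical attribution vectors from repeated runs on the same instance.
stable_runs = [[0.5, -0.2, 0.1, 0.0]] * 3
unstable_runs = [[0.5, 0.2, 0.0], [0.5, 0.0, 0.2]]
stable_score = mean_pairwise_jaccard(stable_runs, k=2)
unstable_score = mean_pairwise_jaccard(unstable_runs, k=2)
```

A score of 1.0 means every run selected the same features; the second example shares only one of three features across its two runs, so its score is 1/3.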

Abstract

Machine learning techniques, such as deep learning and ensemble methods, are widely used in various domains due to their ability to handle complex real-world tasks. However, their black-box nature has raised multiple concerns about the fairness, trustworthiness, and transparency of computer-assisted decision-making. This has led to the emergence of local post-hoc explainability methods, which offer explanations for individual decisions made by black-box algorithms. Among these methods, Kernel SHAP is widely used due to its model-agnostic nature and its well-founded theoretical framework. Despite these strengths, Kernel SHAP suffers from high instability: different executions of the method with the same inputs can lead to significantly different explanations, which diminishes the relevance of the explanations. The contribution of this paper is two-fold. On the one hand, we show that Kernel SHAP's instability is caused by its stochastic neighbor selection procedure, which we adapt to achieve full stability without compromising explanation fidelity. On the other hand, we show that by restricting neighbor generation to perturbations of size 1 -- which we call the coalitions of Layer 1 -- we obtain a novel feature-attribution method that is fully stable, computationally efficient, and still meaningful.
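As a rough illustration of why Layer 1 is both deterministic and cheap, the sketch below enumerates its coalitions as binary masks over $M$ features. It assumes Layer 1 contains both the coalitions with a single feature present and those with a single feature absent, which is what the terms $f(\{i\})$ and $f(N\backslash\{i\})$ in Theorem 1 suggest; the function name `layer1_coalitions` is hypothetical.

```python
import numpy as np


def layer1_coalitions(M):
    """Binary masks of the assumed Layer 1: M singletons and M complements.

    Row r has a 1 in column j when feature j is kept from the explained
    instance and a 0 when it is perturbed.
    """
    singletons = np.eye(M, dtype=int)   # exactly one feature present
    complements = 1 - singletons        # exactly one feature absent
    return np.vstack([singletons, complements])


masks = layer1_coalitions(4)
```

No random sampling is involved, so two calls always return the same $2M$ masks, and the number of model evaluations grows linearly with $M$.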

Paper Structure (28 sections, 2 theorems, 37 equations, 11 figures, 5 tables)

Introduction and Related Work
Background
Shapley Values
SHAP Values
Kernel SHAP.
Reason for Instability
Improving Kernel SHAP’s Stability
Experiments
Experimental Protocol
Environment.
Datasets.
Black-Box Models.
Metrics.
Results
First Layer Attribution Analysis
...and 13 more sections

Key Result

Theorem 1

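For any feature $j$, the attribution value $\phi_j$ computed by ST-SHAP when filling exactly Layer 1 is $\phi_j = \tilde{\phi}_j + \frac{1}{M}\left(f(N) - f(\emptyset) - \sum_{i \in N} \tilde{\phi}_i\right)$, where for any $i$, $\tilde{\phi}_i= \frac{f(\{i\})- f(\emptyset) + f(N) - f(N\backslash\{i\}) }{2}$.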
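The following is a minimal sketch of a Layer-1 attribution computation, not the authors' implementation. It takes $\tilde{\phi}_i$ as defined in the theorem and assumes the final value adds a uniform correction $\frac{1}{M}\left(f(N) - f(\emptyset) - \sum_i \tilde{\phi}_i\right)$ so that the attributions sum to $f(N) - f(\emptyset)$. The value function `f`, which maps a frozenset of feature indices to a number, and the two toy games are hypothetical.

```python
def layer1_attributions(f, M):
    """Layer-1 attributions for a value function f over features 0..M-1."""
    full = frozenset(range(M))
    f_empty, f_full = f(frozenset()), f(full)
    # tilde_phi_i averages the marginal gain of adding i to the empty
    # coalition and the marginal loss of removing i from the full one.
    tilde = [
        (f(frozenset({i})) - f_empty + f_full - f(full - {i})) / 2
        for i in range(M)
    ]
    # Assumed uniform correction enforcing sum(phi) == f(N) - f(empty).
    correction = (f_full - f_empty - sum(tilde)) / M
    return [t + correction for t in tilde]


# Toy additive game: the attributions should recover the weights.
weights = [1.0, -2.0, 0.5]
additive = lambda S: 3.0 + sum(weights[i] for i in S)
phi_additive = layer1_attributions(additive, 3)

# Toy interaction game: pays 1 only when features 0 and 1 are both present.
pair_game = lambda S: 1.0 if {0, 1} <= S else 0.0
phi_pair = layer1_attributions(pair_game, 3)
```

The computation calls `f` on $2M + 2$ coalitions, which is consistent with the linear cost in $M$ mentioned in the TL;DR. On the interaction game the result is `[0.5, 0.5, 0.0]`, which happens to coincide with the exact Shapley values for that game.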

Figures (11)

Figure 1: Illustration of the instability of SHAP on an instance from the Dry Bean dataset. The explanation size is 4, so each explanation computation yields four features with non-zero values.
Figure 2: Evolution of the Jaccard coefficient for SHAP and ST-SHAP on Boston, Dry Bean, HELOC, and Spambase datasets. The vertical lines represent the budgets that result in a complete layer.
Figure 3: Adherence ($R^2$-score for Boston and Accuracy for others) vs. budget size for Boston, Dry Bean, HELOC, and Spambase datasets.
Figure 4: Comparison results between SHAP and ST-SHAP across various criteria on the Boston dataset.
Figure 5: Comparison results between SHAP and ST-SHAP across various criteria on the Movie dataset.
...and 6 more figures

Theorems & Definitions (4)

Theorem 1: Attribution values with Layer 1
proof
Theorem : Attribution values with Layer 1
proof
