SHAP scores fail pervasively even when Lipschitz succeeds

Olivier Letoffe; Xuanxiang Huang; Joao Marques-Silva

TL;DR

The paper systematically shows that SHAP scores, while popular, can be thoroughly unsatisfactory not only for Boolean classifiers but also for regression, including Lipschitz-continuous and arbitrarily differentiable cases. By formalizing an explanation framework with abductive and contrastive explanations and leveraging a generalized similarity notion, the authors construct infinite families of counterexamples across finite and uncountable codomains. They demonstrate that SHAP scores can attribute nonzero importance to irrelevant features or assign zero importance to truly influential ones, even under Lipschitz continuity intended to bolster robustness. The results argue for reevaluating SHAP-based explainability in light of these fundamental limitations and motivate the search for alternative measures of feature importance with stronger theoretical guarantees and broader applicability.

Abstract

The ubiquitous use of Shapley values in eXplainable AI (XAI) has been triggered by the tool SHAP; as a result, these values are commonly referred to as SHAP scores. Recent work devised examples of machine learning (ML) classifiers for which the computed SHAP scores are thoroughly unsatisfactory, in that they allow human decision-makers to be misled. Nevertheless, such examples could be perceived as somewhat artificial, since the selected classes must be interpreted as numeric. Furthermore, it was unclear how general the issues identified with SHAP scores were. This paper answers these criticisms. First, the paper shows that for Boolean classifiers there are arbitrarily many examples for which the SHAP scores must be deemed unsatisfactory. Second, the paper shows that the issues with SHAP scores are also observed in the case of regression models. In addition, the paper studies the class of regression models that respect Lipschitz continuity, a measure of a function's rate of change that finds important recent uses in ML, including model robustness. Concretely, the paper shows that the issues with SHAP scores occur even for regression models that respect Lipschitz continuity. Finally, the paper shows that the same issues are guaranteed to exist for arbitrarily differentiable regression models.
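To make the kind of failure described above concrete, the following is a minimal brute-force sketch, not the paper's construction. It assumes a toy Boolean classifier f(x1, x2) = x1 OR x2, the instance (1, 0), a uniform distribution over inputs, and the common definition of the SHAP characteristic function as the expected prediction with a subset of features fixed to the instance's values. All names (`classifier`, `INSTANCE`, `shap_score`, `axps`) are hypothetical and chosen for illustration. A feature is treated as irrelevant when it occurs in no subset-minimal abductive explanation.

```python
from itertools import combinations, product
from math import factorial

# Toy classifier (an assumption for illustration): f(x1, x2) = x1 OR x2.
def classifier(x):
    return int(x[0] or x[1])

N = 2                      # number of features
INSTANCE = (1, 0)          # instance being explained; prediction is 1
POINTS = list(product((0, 1), repeat=N))  # uniform distribution assumed

def consistent(S):
    """All points that agree with INSTANCE on the features in S."""
    return [x for x in POINTS if all(x[i] == INSTANCE[i] for i in S)]

def value(S):
    """Characteristic function: expected prediction with features in S fixed."""
    pts = consistent(S)
    return sum(classifier(x) for x in pts) / len(pts)

def shap_score(i):
    """Exact Shapley value of feature i by enumerating all coalitions."""
    others = [j for j in range(N) if j != i]
    total = 0.0
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            weight = factorial(k) * factorial(N - k - 1) / factorial(N)
            total += weight * (value(S + (i,)) - value(S))
    return total

def is_weak_axp(S):
    """Fixing the features in S is sufficient for the prediction."""
    target = classifier(INSTANCE)
    return all(classifier(x) == target for x in consistent(S))

def axps():
    """Subset-minimal abductive explanations, found by increasing size."""
    found = []
    for k in range(N + 1):
        for S in combinations(range(N), k):
            if is_weak_axp(S) and not any(set(T) <= set(S) for T in found):
                found.append(S)
    return found

if __name__ == "__main__":
    print("AXps:", axps())
    for i in range(N):
        print(f"feature {i}: SHAP score = {shap_score(i)}")
```

Under these assumptions the only abductive explanation is the first feature alone, so the second feature is irrelevant to the prediction, yet its exact Shapley value is -1/8 rather than 0. The paper's own counterexamples are more general, covering regression models and Lipschitz-continuous functions.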
