The Explanation Game -- Rekindled (Extended Version)
Joao Marques-Silva, Xuanxiang Huang, Olivier Letoffe
TL;DR
The paper addresses the reliability of SHAP-based explanations in XAI by exposing fundamental flaws in the traditional SHAP_T construction, which relies on expected-value characteristic functions. It introduces an alternative SHAP framework built on a monotone characteristic function $v_a$, paired with a sample-based approach (sbXps) and a Shapley-estimation procedure (CGT) that guarantees bounded error with high confidence. The key contributions are (i) a new, non-misleading SHAP definition that preserves the connection between feature-attribution and feature-selection explanations, (ii) a scalable estimation pipeline with polynomial-time guarantees and zero attribution for irrelevant features, and (iii) experiments on boolean, tabular, and image data that report improved ranking fidelity and practicality relative to SHAP. The result is a scalable, theory-backed method aimed at more accurate feature-importance rankings in practice.
Abstract
Recent work demonstrated the existence of critical flaws in the current use of Shapley values in explainable AI (XAI), i.e., the so-called SHAP scores. These flaws are significant in that the scores provided to a human decision-maker can be misleading. Although these negative results might appear to indicate that Shapley values ought not to be used in XAI, this paper argues otherwise. Concretely, this paper proposes a novel definition of SHAP scores that overcomes existing flaws. Furthermore, the paper outlines a practically efficient solution for the rigorous estimation of the novel SHAP scores. Preliminary experimental results confirm our claims, and further underscore the flaws of the current SHAP scores.
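For readers unfamiliar with the underlying quantity, the following is a minimal sketch of the standard exact Shapley value computation for a generic characteristic function. It is not the paper's definition of $v_a$ nor its estimation procedure: the names `shapley_values` and `toy_v` are hypothetical, and the toy game is an assumption chosen only to illustrate that a feature with no marginal contribution receives a score of zero.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, v):
    """Exact Shapley values for features 0..n-1.

    `v` is a characteristic function mapping a frozenset of feature
    indices to a real number. Enumerates all subsets, so the cost is
    exponential in n; suitable only for small illustrations.
    """
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n! for |S| = k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition S.
                total += weight * (v(s | {i}) - v(s))
        values.append(total)
    return values


def toy_v(s):
    # Hypothetical monotone game (an assumption for illustration):
    # a coalition is worth 1 only if it contains features 0 and 1;
    # feature 2 never changes the value.
    return 1.0 if {0, 1} <= s else 0.0


phi = shapley_values(3, toy_v)
print(phi)
```

In this toy game the two features that jointly determine the value share the total equally, the remaining feature is attributed nothing, and the scores sum to the value of the full coalition minus that of the empty one.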
