From SHAP Scores to Feature Importance Scores
Olivier Letoffe, Xuanxiang Huang, Nicholas Asher, Joao Marques-Silva
TL;DR
The paper reframes feature attribution in XAI through the lens of power indices from cooperative game theory, arguing that SHAP scores can yield unsatisfactory feature rankings because of the characteristic function they are built on. It introduces Feature Importance Scores (FISs), a general framework that derives scores from power-index templates (e.g., Shapley-Shubik, Banzhaf, Johnston, Deegan-Packel, Holler-Packel, Andjiga) and a priori voting power concepts, parameterized by an explanation problem. The authors propose a set of targeted properties (efficiency, symmetry, additivity, dummy, minimal monotonicity, γ-efficiency, independence from class labeling, relevancy consistency, and duality consistency) for evaluating FISs, present novel FISs, including duality-based and coverage-based scores, and characterize how these scores relate to the properties. The results indicate when and why certain FISs outperform SHAP in producing reliable feature attributions, and they suggest how to select an FIS based on the desired axiomatic properties and computational trade-offs, connecting explainability, logic-based reasoning, and voting-power theory.
Abstract
A central goal of eXplainable Artificial Intelligence (XAI) is to assign relative importance to the features of a Machine Learning (ML) model given some prediction. The importance of this task of explainability by feature attribution is illustrated by the ubiquitous recent use of tools such as SHAP and LIME. Unfortunately, the exact computation of feature attributions using the game-theoretical foundation underlying SHAP and LIME can yield manifestly unsatisfactory results, which are tantamount to reporting misleading relative feature importance. Recent work has targeted rigorous feature attribution by studying axiomatic aggregations of features based on logic-based definitions of explanations by feature selection. This paper shows that there is an essential relationship between feature attribution and a priori voting power, and that those recently proposed axiomatic aggregations represent a few instantiations of the range of power indices studied in the past. However, it remains unclear how some of the most widely used power indices might be exploited as feature importance scores (FISs), i.e., the use of power indices in XAI, and which of these indices would be best suited for the purposes of XAI by feature attribution, namely in terms of not producing results that could be deemed unsatisfactory. This paper proposes novel desirable properties that FISs should exhibit. In addition, the paper proposes novel FISs exhibiting the proposed properties. Finally, the paper conducts a rigorous analysis of the best-known power indices in terms of the proposed properties.
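To make the notion of a power index concrete, the following is a minimal sketch of two of the best-known indices, Shapley-Shubik and Banzhaf, computed by brute-force enumeration on a toy weighted voting game. The game itself (weights 2, 1, 1 with quota 3) and all function names are illustrative assumptions and do not come from the paper. In the XAI setting the players would be features and the characteristic function would be derived from an explanation problem; the sketch does not attempt to reproduce that construction.

```python
from itertools import combinations
from math import factorial


def is_winning(coalition, weights, quota):
    """A coalition wins when its total weight reaches the quota."""
    return sum(weights[i] for i in coalition) >= quota


def critical_sets(i, weights, quota):
    """Yield coalitions S (without i) that lose alone but win once i joins."""
    others = [j for j in range(len(weights)) if j != i]
    for k in range(len(others) + 1):
        for s in combinations(others, k):
            if not is_winning(s, weights, quota) and is_winning(s + (i,), weights, quota):
                yield s


def shapley_shubik(weights, quota):
    """Weight each critical set S by |S|! (n - |S| - 1)! / n!."""
    n = len(weights)
    return [
        sum(factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
            for s in critical_sets(i, weights, quota))
        for i in range(n)
    ]


def banzhaf(weights, quota):
    """Count critical sets per player, then normalize the counts to sum to 1."""
    swings = [sum(1 for _ in critical_sets(i, weights, quota))
              for i in range(len(weights))]
    total = sum(swings)
    return [s / total for s in swings]


# Hypothetical toy game: three voters with weights 2, 1, 1 and quota 3.
weights, quota = [2, 1, 1], 3
ss = shapley_shubik(weights, quota)
bz = banzhaf(weights, quota)
print("Shapley-Shubik:", ss)
print("Banzhaf:       ", bz)
```

Both indices rank the first voter highest in this game, but they assign different magnitudes (2/3 versus 3/5), which illustrates why the choice of index matters when scores are read as relative importance. The enumeration is exponential in the number of players, so the sketch is only suitable for very small games.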
