Consistent Individualized Feature Attribution for Tree Ensembles

Scott M. Lundberg; Gabriel G. Erion; Su-In Lee

TL;DR

The paper shows that popular feature attribution methods for tree ensembles are inconsistent and presents SHAP values as the unique additive attribution values that are both consistent and locally accurate. It introduces Tree SHAP, which computes exact SHAP values in polynomial rather than exponential time, and extends the approach to SHAP interaction values, which separate main effects from pairwise interaction effects. A user study, run-time benchmarks, and new applications (supervised clustering based on feature attributions, and visualizations of individualized attributions) show closer agreement with human intuition, faster computation, improved clustering performance, and better identification of influential features. An implementation has been merged into XGBoost and LightGBM.

Abstract

Interpreting predictions from tree ensemble methods such as gradient boosting machines and random forests is important, yet feature attribution for trees is often heuristic and not individualized for each prediction. Here we show that popular feature attribution methods are inconsistent, meaning they can lower a feature's assigned importance when the true impact of that feature actually increases. This is a fundamental problem that casts doubt on any comparison between features. To address it we turn to recent applications of game theory and develop fast exact tree solutions for SHAP (SHapley Additive exPlanation) values, which are the unique consistent and locally accurate attribution values. We then extend SHAP values to interaction effects and define SHAP interaction values. We propose a rich visualization of individualized feature attributions that improves over classic attribution summaries and partial dependence plots, and a unique "supervised" clustering (clustering based on feature attributions). We demonstrate better agreement with human intuition through a user study, exponential improvements in run time, improved clustering performance, and better identification of influential features. An implementation of our algorithm has also been merged into XGBoost and LightGBM, see http://github.com/slundberg/shap for details.
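
To make "locally accurate" concrete, the following is a minimal sketch of the classic Shapley value computation on a toy model. It is a brute-force, exponential-time enumeration, not the Tree SHAP algorithm the paper develops. The model `toy_model`, the `BACKGROUND` data, and the helper names are hypothetical and chosen only for illustration. The sketch also assumes independent features, so the value of a feature subset is estimated by averaging over background rows rather than by the tree-based conditional expectation.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical two-feature model: outputs 80 only when both features are 1."""
    fever, cough = x
    return 80.0 if (fever == 1 and cough == 1) else 0.0


# Assumed background data: every combination of two binary features, equally likely.
BACKGROUND = [(0, 0), (0, 1), (1, 0), (1, 1)]


def expected_value(model, x, subset, background):
    """Average model output with features in `subset` fixed to x, the rest taken from background rows."""
    total = 0.0
    for row in background:
        mixed = tuple(x[i] if i in subset else row[i] for i in range(len(x)))
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Brute-force Shapley values: weighted marginal gain of each feature over all subsets of the others."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Classic Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (expected_value(model, x, s | {i}, background)
                        - expected_value(model, x, s, background))
                phi[i] += weight * gain
    return phi


x = (1, 1)
phi = shapley_values(toy_model, x, BACKGROUND)
base = expected_value(toy_model, x, set(), BACKGROUND)

# Local accuracy: the base value plus the attributions reproduces the model output.
print(phi, base, base + sum(phi), toy_model(x))
```

The enumeration visits every subset of the other features, so its cost grows exponentially with the number of features; that is the cost the paper's fast tree algorithm is designed to avoid.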
