Towards Piece-by-Piece Explanations for Chess Positions with SHAP
Francesco Spinnato
TL;DR
This work addresses the opacity of centipawn-based chess engine evaluations by introducing a SHAP-based framework that treats each board piece as a feature and uses ablation to produce per-piece, additive attributions to the engine's output. The engine is reframed as a bounded win-probability predictor $f:X\to[0,1]$, with SHAP values $\phi_i$ computed for non-king pieces to decompose the prediction around a base value of $0.5$ for king-only positions; a legality-aware perturbation strategy ensures feasible inputs. The authors validate the approach through thematic examples and engine comparisons, highlighting intuitive and counterintuitive piece contributions and discussing limitations such as the king’s inaccessibility, potential non-causality, and computational scalability. The findings offer a principled, interpretable lens for pedagogy, training, and engine analysis, and point to future work in scalable hierarchical attributions and broader domain applications.
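The additive decomposition described above can be written with the standard Shapley value formula. The notation here is an assumption made for illustration and is not taken from the paper: $N$ denotes the set of non-king pieces in position $x$, and $f_x(S)$ denotes the engine's win probability when only the pieces in $S \subseteq N$ (plus both kings) remain on the board, so that $f_x(\emptyset) = 0.5$ and $f_x(N) = f(x)$.

$$
\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!}\,\bigl[f_x(S \cup \{i\}) - f_x(S)\bigr],
\qquad
f(x) = 0.5 + \sum_{i \in N} \phi_i .
$$

The second identity is the efficiency property of Shapley values: the per-piece attributions sum exactly to the gap between the prediction and the king-only base value.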
Abstract
Contemporary chess engines offer precise yet opaque evaluations, typically expressed as centipawn scores. While effective for decision-making, these outputs obscure the underlying contributions of individual pieces or patterns. In this paper, we explore adapting SHAP (SHapley Additive exPlanations) to the domain of chess analysis, aiming to attribute a chess engine's evaluation to specific pieces on the board. By treating pieces as features and systematically ablating them, we compute additive, per-piece contributions that explain the engine's output in a locally faithful and human-interpretable manner. This method draws inspiration from classical chess pedagogy, where players assess positions by mentally removing pieces, and grounds it in modern explainable AI techniques. Our approach opens new possibilities for visualization, human training, and engine comparison. We release accompanying code and data to foster future research in interpretable chess AI.
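As a rough illustration of the ablation idea, the sketch below computes exact Shapley values by enumerating every subset of non-king pieces. It is not the released implementation. The evaluator `toy_win_probability`, the `PIECE_VALUES` table, the logistic scale `0.4`, and the example position are all hypothetical stand-ins for a real engine, and the sketch ignores the legality-aware perturbation a real pipeline would need.

```python
from itertools import combinations
from math import exp, factorial

# Hypothetical stand-in for an engine: conventional material values in pawns.
PIECE_VALUES = {"P": 1.0, "N": 3.0, "B": 3.0, "R": 5.0, "Q": 9.0}


def toy_win_probability(pieces):
    """Map a set of (square, symbol) non-king pieces to White's win probability.

    Uppercase symbols are White, lowercase are Black. With no non-king
    pieces the material balance is zero, so the result is exactly 0.5.
    """
    balance = 0.0
    for _square, symbol in pieces:
        value = PIECE_VALUES[symbol.upper()]
        balance += value if symbol.isupper() else -value
    # Arbitrary logistic squashing (assumption) to keep the output in [0, 1].
    return 1.0 / (1.0 + exp(-0.4 * balance))


def shapley_values(pieces, value_fn):
    """Exact Shapley values by subset enumeration (exponential in len(pieces))."""
    pieces = list(pieces)
    n = len(pieces)
    phi = {}
    for i, piece in enumerate(pieces):
        others = pieces[:i] + pieces[i + 1:]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude this piece.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                # Marginal effect of putting the piece back on the board.
                total += weight * (value_fn(coalition | {piece}) - value_fn(coalition))
        phi[piece] = total
    return phi


# Hypothetical position: non-king pieces only, kings are implicit.
position = [("d1", "Q"), ("a1", "R"), ("c3", "N"), ("a8", "r"), ("e5", "p")]

phi = shapley_values(position, toy_win_probability)
base_value = toy_win_probability(frozenset())
prediction = toy_win_probability(frozenset(position))

for piece, contribution in phi.items():
    print(piece, round(contribution, 4))
print("base + sum(phi) =", base_value + sum(phi.values()), "prediction =", prediction)
```

Exact enumeration is only feasible for a handful of pieces, since the number of subsets doubles with each one; a full board would need a sampling-based approximation, which is consistent with the scalability limitation noted in the TL;DR.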
