RankSHAP: Shapley Value Based Feature Attributions for Learning to Rank
Tanya Chowdhury, Yair Zick, James Allan
TL;DR
RankSHAP introduces an axiomatic, Shapley-value-based framework for learning-to-rank explanations, addressing inconsistencies observed in prior ranking attributions. By redefining ranking value functions with Relevance Sensitivity and Position Sensitivity and introducing a Generalized Ranking Effectiveness Metric (GREM), the authors show that RankSHAP is the unique attribution method that satisfies Rank-Efficiency, Rank-Missingness, Rank-Symmetry, and Rank-Monotonicity. They provide a practical KernelSHAP-based approximation, demonstrate higher fidelity than baselines on MS MARCO and Robust04 across multiple rankers, and corroborate alignment with human intuition via a user study. The work also offers an axiomatic comparison of existing ranking attribution methods, highlighting RankSHAP’s additive, decomposable value function as essential for reliable explanations in IR. Overall, RankSHAP aims to provide axiomatically grounded, generalizable, and interpretable explanations for IR ranking models; the authors plan to release their code.
Abstract
Numerous works propose post-hoc, model-agnostic explanations for learning to rank, the task of ordering entities by their relevance to a query, through feature attribution methods. However, these attributions often correlate only weakly with, or even contradict, each other, confusing end users. We adopt an axiomatic game-theoretic approach, popular in the feature attribution community, to identify a set of fundamental axioms that every ranking-based feature attribution method should satisfy. We then introduce RankSHAP, extending classical Shapley values to ranking. We evaluate the RankSHAP framework through extensive experiments on two datasets, multiple ranking methods, and multiple evaluation metrics. Additionally, a user study confirms RankSHAP's alignment with human intuition. We also perform an axiomatic analysis of existing rank attribution algorithms to determine their compliance with our proposed axioms. Ultimately, our aim is to equip practitioners with a set of axiomatically backed feature attribution methods for studying IR ranking models that ensure generality as well as consistency.
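To make the idea of extending classical Shapley values to ranking more concrete, the following minimal sketch computes exact Shapley attributions for a toy linear ranker, using the NDCG of the induced ranking as the value of a feature coalition. The ranker, the data, the `value` function, and the choice to zero out absent features are all assumptions made for illustration; this is not the paper's GREM definition or its KernelSHAP-based approximation.

```python
from itertools import combinations
from math import factorial, log2


def ndcg(ranked_rels):
    """NDCG of graded relevance labels listed in ranked order."""
    def dcg(rels):
        return sum((2 ** r - 1) / log2(pos + 2) for pos, r in enumerate(rels))
    ideal = dcg(sorted(ranked_rels, reverse=True))
    return dcg(ranked_rels) / ideal if ideal > 0 else 0.0


def value(coalition, weights, docs, rels):
    """Assumed value function: NDCG of the ranking obtained when only the
    features in `coalition` are visible (absent features contribute 0)."""
    scores = [sum(weights[f] * d[f] for f in coalition) for d in docs]
    # Stable sort: tied documents keep their original order.
    order = sorted(range(len(docs)), key=lambda i: -scores[i])
    return ndcg([rels[i] for i in order])


def shapley(n_features, v):
    """Exact Shapley values by enumerating all coalitions (exponential cost)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [f for f in range(n_features) if f != i]
        for k in range(len(others) + 1):
            # Classical Shapley weight |S|! (n - |S| - 1)! / n!
            w = factorial(k) * factorial(n_features - k - 1) / factorial(n_features)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += w * (v(s | {i}) - v(s))
    return phi


# Toy data (hypothetical): 3 documents, 3 features, graded relevance labels.
docs = [[1.0, 0.2, 0.5], [0.3, 0.9, 0.5], [0.1, 0.1, 0.5]]
weights = [1.0, 0.5, 0.0]  # feature 2 is ignored by the toy ranker
rels = [0, 2, 1]

v = lambda s: value(s, weights, docs, rels)
phi = shapley(len(weights), v)
print(phi)
```

In this sketch the attributions sum to the value of the full feature set minus the value of the empty set (the classical efficiency property), and the feature the toy ranker ignores receives zero attribution. Exact enumeration is exponential in the number of features, which is why sampling-based approximations such as KernelSHAP are used in practice.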
