RankingSHAP -- Listwise Feature Attribution Explanations for Ranking Models
Maria Heuss, Maarten de Rijke, Avishek Anand
TL;DR
RankingSHAP introduces listwise feature attribution for ranking models to address the limitations of pointwise explanations. It extends SHAP through listwise masking and flexible explanation objectives defined over the whole ranked list (e.g., Kendall's $\tau$), enabling faithful, contrastive understanding of ranking decisions. The authors define two evaluation paradigms (Preservation and Deletion) to assess attribution faithfulness on the learning-to-rank (LtR) benchmarks MQ2008 and MSLR, and provide a white-box toy example to illustrate interpretability benefits. Empirical results show that RankingSHAP yields faithful attributions and can reveal model biases; the authors acknowledge computational costs and interpretability challenges, and release a public code repository for reproducibility.
Abstract
While SHAP (SHapley Additive exPlanations) and other feature attribution methods are commonly employed to explain model predictions, their application within information retrieval (IR), particularly for complex outputs such as ranked lists, remains limited. Existing attribution methods typically provide pointwise explanations, focusing on why a single document received a high ranking score, rather than considering the relationships between documents in a ranked list. We present three key contributions to address this gap. First, we rigorously define listwise feature attribution for ranking models. Second, we introduce RankingSHAP, extending the popular SHAP framework to accommodate listwise ranking attribution, addressing a significant methodological gap in the field. Third, we propose two novel evaluation paradigms for assessing the faithfulness of attributions in learning-to-rank models, measuring the correctness and completeness of the explanation with respect to different aspects. Through experiments on standard learning-to-rank datasets, we demonstrate RankingSHAP's practical application while identifying the constraints of selection-based explanations. We further employ a simulated study with an interpretable model to showcase how listwise ranking attributions can be used to examine model decisions and conduct a qualitative evaluation of explanations. Due to the contrastive nature of the ranking task, our understanding of ranking model decisions can substantially benefit from feature attribution explanations like RankingSHAP.
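To make the idea of a listwise attribution concrete, the following is a minimal sketch and not the authors' implementation. It computes exact Shapley values over features, where the value of a feature coalition is Kendall's tau between the original ranking and the ranking obtained when every feature outside the coalition is replaced by a background value for all documents in the list at once. The names `listwise_shapley`, `score_fn`, and `background` are hypothetical, and several choices are assumptions made for illustration: a single background vector is used for masking, tied pairs contribute zero to tau, and coalitions are enumerated exhaustively, which is only feasible for a handful of features. A practical method would approximate the sum by sampling.

```python
import itertools
import math

import numpy as np


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between the orderings induced by two score vectors.

    Assumption for this sketch: a pair tied in either vector contributes 0.
    """
    n = len(scores_a)
    total = 0.0
    for i, j in itertools.combinations(range(n), 2):
        total += np.sign(scores_a[i] - scores_a[j]) * np.sign(scores_b[i] - scores_b[j])
    return total / (n * (n - 1) / 2)


def listwise_shapley(score_fn, docs, background):
    """Exact Shapley values of each feature for a listwise objective.

    score_fn:   hypothetical callable mapping an (n_docs, n_feats) array to n_docs scores
    docs:       feature matrix of the documents retrieved for one query
    background: one value per feature, used to mask features outside a coalition
    """
    n_docs, n_feats = docs.shape
    base_scores = score_fn(docs)

    def value(coalition):
        # Listwise masking: the same features are masked for every document.
        masked = np.tile(np.asarray(background, dtype=float), (n_docs, 1))
        idx = list(coalition)
        if idx:
            masked[:, idx] = docs[:, idx]
        return kendall_tau(base_scores, score_fn(masked))

    phi = np.zeros(n_feats)
    for f in range(n_feats):
        others = [g for g in range(n_feats) if g != f]
        for size in range(n_feats):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = (
                math.factorial(size)
                * math.factorial(n_feats - size - 1)
                / math.factorial(n_feats)
            )
            for subset in itertools.combinations(others, size):
                phi[f] += weight * (value(subset + (f,)) - value(subset))
    return phi
```

In this sketch the attribution of a feature reflects how much it contributes to preserving the order of the whole list, rather than the score of any single document. A feature that the model ignores receives an attribution of zero, and by the efficiency property of Shapley values the attributions sum to the difference in tau between the unmasked and the fully masked list.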
