Explainable Information Retrieval: A Survey

Avishek Anand, Lijun Lyu, Maximilian Idahl, Yumeng Wang, Jonas Wallat, Zijian Zhang

TL;DR

The paper surveys explainable information retrieval (XIR), addressing the need for transparency amid opaque neural ranking models and biased data. It offers a unified taxonomy spanning post-hoc and interpretable-by-design approaches, including feature attribution, free-text explanations, probing, axiomatic analysis, and rationale-based methods, while discussing grounding in IR principles and evaluation strategies. It highlights open challenges such as ground-truth scarcity, fidelity of explanations, and the balance between performance and interpretability, and it outlines practical design considerations for trustworthy, knowledge-intensive search. Overall, the work provides a comprehensive framework to guide researchers and practitioners in building auditable, user-centric retrieval systems and identifies key directions for future, rigorous evaluation and development.

Abstract

Explainable information retrieval is an emerging research area that aims to make information retrieval systems transparent and trustworthy. Given the increasing use of complex machine learning models in search systems, explainability is essential for building and auditing responsible information retrieval models. This survey fills a vital gap in the otherwise topically diverse literature on explainable information retrieval. It categorizes and discusses recent explainability methods developed for different application domains in information retrieval, providing a common framework and unifying perspectives. In addition, it reflects on the common concern of evaluating explanations and highlights open challenges and opportunities.

Paper Structure (71 sections, 3 equations, 10 figures, 6 tables)

This paper contains 71 sections, 3 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Categorization of explainable IR approaches, where § indicates the section in which each approach is discussed.
  • Figure 2: Example ranking result showing top-5 ranked documents with predicted relevance scores for the query "can you do yoga from a chair". The query and documents are selected from TREC-DL (2021) [craswell:2021:trec-dl] and MS MARCO [nguyen:2016:msmarco], respectively.
  • Figure 3: A fictive example using a heatmap to visualize feature attributions for the top-2 ranked documents for the query "can you do yoga from a chair". Feature importance is highlighted in orange.
  • Figure 4: Example visualization of feature attributions for a single query-document pair using the BERT-style input format, which is "[CLS] query [SEP] document [SEP]". Important tokens are highlighted in orange.
  • Figure 5: Example bar chart visualization of feature attributions for different groups of tokens.
  • ...and 5 more figures