Discovering Biases in Information Retrieval Models Using Relevance Thesaurus as Global Explanation

Youngwoo Kim, Razieh Rahimi, James Allan

TL;DR

This paper addresses global explanation of neural information retrieval models. It constructs a relevance thesaurus of semantically related query-document term pairs and uses a two-stage distillation, via PaRM, to yield a sparse term-level retriever, PaCE. The approach defines a global score $S(q,d)$ and a term-pair table $M$ that capture term-level relevance, so the explanation can faithfully approximate cross-encoder rankings while remaining interpretable. The thesaurus augments lexical rankers such as BM25 and uncovers biases such as brand-name bias. Fidelity and zero-shot semantic matching are evaluated to assess practical interpretability and robustness. The work advances interpretable first-stage retrieval by providing a scalable, global explanation framework for transformer-based ranking models.

Abstract

Most efforts in interpreting neural relevance models have focused on local explanations, which explain the relevance of a document to a query but are not useful in predicting the model's behavior on unseen query-document pairs. We propose a novel method to globally explain neural relevance models by constructing a "relevance thesaurus" containing semantically relevant query and document term pairs. This thesaurus is used to augment lexical matching models such as BM25 to approximate the neural model's predictions. Our method involves training a neural relevance model to score the relevance of partial query and document segments, which is then used to identify relevant terms across the vocabulary space. We evaluate the obtained thesaurus explanation based on ranking effectiveness and fidelity to the target neural ranking model. Notably, our thesaurus reveals the existence of brand name bias in ranking models, demonstrating one advantage of our explanation method.
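
To make the idea of augmenting BM25 with a relevance thesaurus concrete, here is a minimal sketch. It is not the authors' implementation: the function name `augmented_bm25`, the dictionary layout of `thesaurus`, and the choice to add thesaurus weights to the term frequency before BM25 saturation are all assumptions made for illustration.

```python
from collections import Counter


def augmented_bm25(query_terms, doc_terms, thesaurus, idf, avgdl, k1=1.2, b=0.75):
    """BM25-style score where thesaurus pairs count as soft term matches.

    thesaurus maps (query_term, doc_term) -> weight in [0, 1] (assumed layout).
    idf maps query_term -> inverse document frequency (precomputed elsewhere).
    """
    counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    # Standard BM25 length normalisation term.
    norm = k1 * (1.0 - b + b * doc_len / avgdl)

    score = 0.0
    for q in query_terms:
        # Exact matches contribute their raw count.
        tf = float(counts.get(q, 0))
        # Assumption: each related document term adds weight * count.
        for t, c in counts.items():
            if t != q:
                tf += thesaurus.get((q, t), 0.0) * c
        if tf > 0:
            score += idf.get(q, 0.0) * tf * (k1 + 1.0) / (tf + norm)
    return score


# Hypothetical toy data: one thesaurus entry and one known IDF value.
toy_thesaurus = {("car", "automobile"): 0.8}
toy_idf = {"car": 2.0}

exact = augmented_bm25(["car"], ["car", "dealer"], toy_thesaurus, toy_idf, avgdl=2.0)
soft = augmented_bm25(["car"], ["automobile", "dealer"], toy_thesaurus, toy_idf, avgdl=2.0)
plain = augmented_bm25(["car"], ["automobile", "dealer"], {}, toy_idf, avgdl=2.0)
```

With an empty thesaurus the function reduces to plain BM25, so the document containing only "automobile" scores zero for the query "car". With the toy entry it receives a positive score that stays below the exact-match score. Because every contribution traces back to a single table entry, a pair such as a brand name linked to a generic query term can be inspected directly.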

Paper Structure (5 sections, 5 equations, 2 tables)

This paper contains 5 sections, 5 equations, 2 tables.