Pathway to Relevance: How Cross-Encoders Implement a Semantic Variant of BM25

Meng Lu, Catherine Chen, Carsten Eickhoff

TL;DR

This work analyzes how a BERT-based cross-encoder estimates document relevance and asks whether traditional IR signals such as $TF$ and $IDF$ are embedded in neural models. Using mechanistic interpretability, the authors causally locate BM25-like components: soft-TF in early-to-middle layers, $IDF$ encoded in a dominant embedding direction via a low-rank representation, and a BM25-style aggregation in later layers. They formalize a BM25-like linear function and show that this linear model fits cross-encoder scores well, with $r = 0.8157$ ($p < 0.001$), exceeding a tuned BM25 baseline ($r = 0.4200$) and generalizing across 12 IR datasets. The findings reveal a two-stage relevance computation with practical implications for targeted model editing, personalization, bias mitigation, and parameter-efficient adaptation, advancing transparency and controllability in transformer-based IR.

Abstract

Mechanistic interpretation has greatly contributed to a more detailed understanding of generative language models, enabling significant progress in identifying structures that implement key behaviors through interactions between internal components. In contrast, interpretability in information retrieval (IR) remains relatively coarse-grained, and much is still unknown as to how IR models determine whether a document is relevant to a query. In this work, we address this gap by mechanistically analyzing how one commonly used model, a cross-encoder, estimates relevance. We find that the model extracts traditional relevance signals, such as term frequency and inverse document frequency, in early-to-middle layers. These concepts are then combined in later layers, similar to the well-known probabilistic ranking function, BM25. Overall, our analysis offers a more nuanced understanding of how IR models compute relevance. Isolating these components lays the groundwork for future interventions that could enhance transparency, mitigate safety risks, and improve scalability.
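
For readers who want a concrete reference point, the sketch below implements the classic Okapi BM25 function that the abstract compares the cross-encoder to. It is a minimal illustration, not the BM25-like linear function the authors formalize: it uses exact term matching rather than soft-TF, and the function name `bm25_score`, the toy corpus, and the defaults `k1=1.2` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_score(query, doc, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a tokenized query with Okapi BM25.

    `corpus` is a list of tokenized documents used for document-frequency
    and average-length statistics. All names and defaults are illustrative.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc)  # exact-match term frequencies (no semantic matching)

    score = 0.0
    for term in set(query):
        # Document frequency: number of documents containing the term.
        df = sum(1 for d in corpus if term in d)
        # IDF with +1 inside the log, which keeps the weight non-negative.
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        freq = tf[term]
        # Saturating, length-normalized term-frequency component.
        norm = freq + k1 * (1.0 - b + b * len(doc) / avg_len)
        score += idf * freq * (k1 + 1.0) / norm
    return score


corpus = [["cat", "sat"], ["dog", "ran"], ["cat", "cat", "nap"]]
scores = [bm25_score(["cat"], d, corpus) for d in corpus]
```

In this toy corpus, the document that repeats "cat" scores highest for the query "cat", and the document without the term scores zero. Each term's IDF weights its saturated term frequency before the per-term contributions are summed, which is the aggregation pattern the abstract attributes to the model's later layers.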

Paper Structure

This paper contains 39 sections, 3 equations, 22 figures, 2 tables.

Figures (22)

  • Figure 1: Overview of relevance mechanisms in the model. The model first jointly analyzes query and document tokens to identify matching terms (exact and semantic), then contextualizes each query term within the full query, and finally calculates a relevance score by weighting query terms by their importance (IDF) and aggregating them, similar to BM25.
  • Figure 2: Path patching identifies heads 10.1, 10.4, 10.7, and 10.10 as the most important carriers of soft-TF signals to [CLS] on both TFC1 and STMC1, with similar patching effects.
  • Figure 3: Example of the attention pattern from [CLS] to query tokens in the Relevance Scoring Heads, illustrating how these heads process soft-TF for specific query tokens based on their IDF values.
  • Figure 4: Example attention pattern of Matching Head 4.9: tokens attend most strongly to duplicates but also mildly to similar tokens.
  • Figure 5: $U_0$ editing example (transposed for visualization). Since $U_0$ is negatively correlated with IDF, increasing tok1's IDF, representing its importance in relevance computation, requires negatively scaling its $U_0$ component.
  • ...and 17 more figures