Modularized Transformer-based Ranking Framework

Luyu Gao, Zhuyun Dai, Jamie Callan

TL;DR

The paper introduces MORES, a modular Transformer-based ranking framework that decouples text representation from query-document interaction by splitting the ranker into three dedicated modules. This design allows document representations to be pre-computed offline, leaving only a lightweight interaction step online. The resulting ranker is competitive with BERT in effectiveness while running substantially faster (up to ~170×), and its attention patterns are easier to interpret. Domain adaptation experiments show that the modular design transfers well: representations require domain-specific tuning, while interaction patterns remain relatively general. Together, these results advance efficient, interpretable neural IR and clarify the distinct roles of representation and interaction in Transformer rankers.

Abstract

Recent innovations in Transformer-based ranking models have advanced the state-of-the-art in information retrieval. However, these Transformers are computationally expensive, and their opaque hidden states make it hard to understand the ranking process. In this work, we modularize the Transformer ranker into separate modules for text representation and interaction. We show how this design enables substantially faster ranking using offline pre-computed representations and light-weight online interactions. The modular design is also easier to interpret and sheds light on the ranking process in Transformer rankers.
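
The offline/online split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` and `interact` functions below are hypothetical toy stand-ins (random token embeddings and a single query-to-document attention step) for the Representation and Interaction Modules, and all names and sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8      # assumed toy embedding size
VOCAB = {}   # hypothetical toy vocabulary: token -> random embedding


def embed(tokens):
    """Toy stand-in for a Representation Module: one vector per token."""
    for tok in tokens:
        if tok not in VOCAB:
            VOCAB[tok] = rng.standard_normal(DIM)
    return np.stack([VOCAB[tok] for tok in tokens])


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def interact(q_rep, d_rep):
    """Toy stand-in for the Interaction Module: query-to-document attention."""
    attn = softmax(q_rep @ d_rep.T / np.sqrt(DIM))  # shape (q_len, d_len)
    attended = attn @ d_rep                         # shape (q_len, DIM)
    # Score: mean similarity between each query token and its attended context.
    return float((q_rep * attended).sum(axis=1).mean())


# Offline: compute and cache document representations once.
docs = {
    "d1": "transformers rank documents".split(),
    "d2": "cooking pasta at home".split(),
}
doc_cache = {doc_id: embed(tokens) for doc_id, tokens in docs.items()}


# Online: encode only the query, then run the light-weight interaction.
def rank(query):
    q_rep = embed(query.split())
    scores = {doc_id: interact(q_rep, d_rep) for doc_id, d_rep in doc_cache.items()}
    return sorted(scores, key=scores.get, reverse=True)


print(rank("rank documents"))
```

In this sketch the per-query cost is one short query encoding plus one attention step per candidate document, since the document side is read from `doc_cache` rather than re-encoded; that is the source of the speedup the abstract refers to, shown here only in toy form.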

Paper Structure

This paper contains 22 sections, 8 equations, 2 figures, 6 tables.

Figures (2)

  • Figure 1: An illustration of the attention within a MORES model using two layers of Interaction Blocks ($2\times$ IB). Representation Modules only show 1 layer of attention due to space limits. In a real model, Document Representation Module and Query Representation Module are deeper than shown here.
  • Figure 2: Visualization of attention in MORES's Representation and Interaction Modules.