Modularized Transformer-based Ranking Framework
Luyu Gao, Zhuyun Dai, Jamie Callan
TL;DR
The paper introduces MORES, a modular Transformer-based ranking framework that decouples text representation from query-document interaction by splitting the ranker into three dedicated modules. This design enables offline pre-computation of document representations and lightweight online interaction, achieving effectiveness competitive with BERT at substantial speedups (up to ~170×) and improving interpretability through attention analysis. Domain adaptation experiments show that the modular design transfers well: representations require domain-specific tuning, while interaction patterns remain relatively general. Together, these results advance efficient, interpretable neural IR and provide insight into the distinct roles of representation and interaction in Transformer rankers.
Abstract
Recent innovations in Transformer-based ranking models have advanced the state of the art in information retrieval. However, these Transformers are computationally expensive, and their opaque hidden states make it hard to understand the ranking process. In this work, we modularize the Transformer ranker into separate modules for text representation and interaction. We show how this design enables substantially faster ranking using offline pre-computed representations and lightweight online interactions. The modular design is also easier to interpret and sheds light on the ranking process in Transformer rankers.
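
To make the split between offline representation and online interaction concrete, the following is a minimal toy sketch, not the MORES implementation. It assumes a hash-based token embedding as a stand-in for the Transformer representation modules and a single cross-attention step as a stand-in for the interaction module. All names (`represent`, `interact`, `rank`, `DIM`) and the example documents are hypothetical.

```python
import hashlib
import math

DIM = 8  # toy embedding width (assumption, chosen only for illustration)


def embed_token(token):
    """Deterministic toy embedding; a stand-in for a learned encoder."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [(b / 255.0) - 0.5 for b in digest[:DIM]]


def represent(text):
    """Representation step: one vector per whitespace-separated token."""
    return [embed_token(tok) for tok in text.lower().split()]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def interact(query_repr, doc_repr):
    """Interaction step: each query token attends over the document tokens.

    The score is the mean similarity between each query vector and its
    attention-weighted document context.
    """
    if not query_repr or not doc_repr:
        return 0.0
    scale = math.sqrt(DIM)
    total = 0.0
    for q in query_repr:
        weights = softmax([dot(q, d) / scale for d in doc_repr])
        context = [
            sum(w * d[i] for w, d in zip(weights, doc_repr))
            for i in range(DIM)
        ]
        total += dot(q, context)
    return total / len(query_repr)


# Offline: document representations are computed once and stored.
docs = {
    "d1": "neural ranking models for information retrieval",
    "d2": "a recipe for sourdough bread",
    "d3": "transformer rankers are computationally expensive",
}
doc_index = {doc_id: represent(text) for doc_id, text in docs.items()}


def rank(query, index):
    """Online: encode only the query, then run the light interaction step."""
    query_repr = represent(query)
    scores = {doc_id: interact(query_repr, d) for doc_id, d in index.items()}
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank("neural ranking", doc_index)
```

In this sketch the per-document cost of `represent` is paid once when `doc_index` is built, so a query only triggers one call to `represent` plus one `interact` call per candidate. Because the embeddings are arbitrary hashes, the resulting order carries no meaning; only the control flow is illustrative.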
