QDER: Query-Specific Document and Entity Representations for Multi-Vector Document Re-Ranking
Shubham Chatterjee, Jeff Dalton
TL;DR
QDER proposes a unified, query-specific, entity-aware multi-vector re-ranking framework that preserves token- and entity-level representations through a late-aggregation approach. It leverages attention-guided interactions, bilinear cross-pattern scoring, and external lexical signals to produce discriminative query-focused embeddings, significantly improving ranking, especially for difficult queries. Empirical results across five benchmarks show substantial gains over strong baselines, with notable improvements on challenging topics; case studies illustrate robust entity attention dynamics. This work demonstrates that dynamically adapting document representations to each query, and integrating knowledge graph signals within a multi-vector architecture, can substantially advance neural IR performance, which is of practical relevance to retrieval and RAG systems.
Abstract
Neural IR has advanced through two distinct paths: entity-oriented approaches leveraging knowledge graphs and multi-vector models capturing fine-grained semantics. We introduce QDER, a neural re-ranking model that unifies these approaches by integrating knowledge graph semantics into a multi-vector model. QDER's key innovation lies in its modeling of query-document relationships: rather than computing similarity scores on aggregated embeddings, we maintain individual token and entity representations throughout the ranking process, performing aggregation only at the final scoring stage, an approach we call "late aggregation." We first transform these fine-grained representations through learned attention patterns, then apply carefully chosen mathematical operations for precise matches. Experiments across five standard benchmarks show that QDER achieves significant performance gains, with improvements of 36% in nDCG@20 over the strongest baseline on TREC Robust 2004 and similar improvements on other datasets. QDER particularly excels on difficult queries, achieving an nDCG@20 of 0.70 where traditional approaches fail completely (nDCG@20 = 0.0), setting a foundation for future work in entity-aware retrieval.
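To make the contrast between early and late aggregation concrete, the following is a minimal illustrative sketch, not the QDER implementation. The abstract does not specify the attention function or the interaction operations, so everything here is an assumption chosen for illustration: the function names, the softmax over maximum dot products as the attention pattern, the element-wise product as the interaction, and the linear scoring weights are all hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mean_pool(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sum_pool(vectors):
    return [sum(col) for col in zip(*vectors)]

def early_aggregation_score(query_vecs, doc_vecs):
    # Baseline: collapse each side to a single vector first, then compare.
    return dot(mean_pool(query_vecs), mean_pool(doc_vecs))

def attention_weights(query_vecs, doc_vecs):
    # Assumed attention pattern: each document vector is weighted by a
    # softmax over its best dot-product match against the query vectors.
    relevance = [max(dot(d, q) for q in query_vecs) for d in doc_vecs]
    return softmax(relevance)

def late_aggregation_score(query_vecs, doc_vecs, scoring_weights):
    # Keep one vector per token/entity, reweight each by the query,
    # and aggregate only when the final score is computed.
    weights = attention_weights(query_vecs, doc_vecs)
    query_specific = [[w * x for x in d] for w, d in zip(weights, doc_vecs)]
    doc_repr = sum_pool(query_specific)      # aggregation happens here
    query_repr = sum_pool(query_vecs)
    # Assumed interaction: element-wise product, then a linear scorer.
    interaction = [a * b for a, b in zip(query_repr, doc_repr)]
    return dot(scoring_weights, interaction)

# Toy example: one query vector; one matching and two non-matching
# document vectors (these could stand for tokens or entities).
query_vecs = [[1.0, 0.0]]
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(attention_weights(query_vecs, doc_vecs))
print(early_aggregation_score(query_vecs, doc_vecs))
print(late_aggregation_score(query_vecs, doc_vecs, [1.0, 1.0]))
```

In this toy setting the same document vectors receive different weights for different queries, so the pooled document representation is query-specific rather than fixed in advance; in a trained model the attention and scoring parameters would be learned rather than hand-set.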
