REGENT: Relevance-Guided Attention for Entity-Aware Multi-Vector Neural Re-Ranking
Shubham Chatterjee
TL;DR
REGENT tackles long-form information needs by embedding relevance signals directly into neural attention through a dual-pathway architecture that blends token-level BM25 lexical cues with query-specific entity semantics. The approach introduces relevance-guided attention, token-level BM25 integration, and dynamic entity-aware processing, achieving state-of-the-art re-ranking on three long-document benchmarks with up to 108% improvement over BM25 and notable gains over ColBERT and RankVicuna. Ablation studies show the entity semantic skeleton as the core driver of performance, with lexical signals providing important fine-grained grounding. This work establishes a new paradigm for entity-aware neural IR by tightly weaving lexical and semantic signals into neural attention, enabling robust handling of complex queries in long documents.
Abstract
Current neural re-rankers often struggle with complex information needs and long, content-rich documents. The fundamental issue is not computational but one of intelligent content selection: identifying what matters in lengthy, multi-faceted texts. While humans naturally anchor their understanding around key entities and concepts, neural models process text within rigid token windows, treating all interactions as equally important and missing critical semantic signals. We introduce REGENT, a neural re-ranking model that mimics human-like understanding by using entities as a "semantic skeleton" to guide attention. REGENT integrates relevance guidance directly into the attention mechanism, combining fine-grained lexical matching with high-level semantic reasoning. This relevance-guided attention enables the model to focus on conceptually important content while maintaining sensitivity to precise term matches. REGENT achieves new state-of-the-art performance on three challenging datasets, providing up to a 108% improvement over BM25 and consistently outperforming strong baselines including ColBERT and RankVicuna. To our knowledge, this is the first work to successfully integrate entity semantics directly into neural attention, establishing a new paradigm for entity-aware information retrieval.
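To make the idea of relevance-guided attention concrete, here is a minimal toy sketch, not REGENT's actual architecture. It assumes one simple way of combining the two signals: each document token gets a weight equal to its term's contribution under a common BM25 formulation, and that weight is added as a bias to the raw attention logits before the softmax. The function names, the additive form, the `alpha` scaling factor, and the corpus statistics are all hypothetical and chosen for illustration. The entity pathway is omitted.

```python
import math
from collections import Counter


def bm25_token_weights(query_terms, doc_tokens, doc_freq, num_docs, avg_len,
                       k1=1.2, b=0.75):
    """Give each document token its term's BM25 contribution (0 if not in the query)."""
    tf = Counter(doc_tokens)
    # Length normalisation is shared by every term in the document.
    norm = 1.0 - b + b * len(doc_tokens) / avg_len
    weights = []
    for tok in doc_tokens:
        if tok not in query_terms:
            weights.append(0.0)
            continue
        df = doc_freq.get(tok, 0)
        idf = math.log(1.0 + (num_docs - df + 0.5) / (df + 0.5))
        f = tf[tok]
        weights.append(idf * f * (k1 + 1.0) / (f + k1 * norm))
    return weights


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relevance_guided_attention(scores, relevance, alpha=1.0):
    """Assumed form: add a scaled relevance bias to attention logits, then normalise."""
    biased = [s + alpha * r for s, r in zip(scores, relevance)]
    return softmax(biased)


# Toy example with made-up corpus statistics.
doc = ["solar", "panels", "convert", "sunlight", "into", "power"]
query = {"solar", "power"}
weights = bm25_token_weights(query, doc, {"solar": 10, "power": 50},
                             num_docs=1000, avg_len=6.0)
# Uniform raw logits, so the bias alone determines where attention goes.
attn = relevance_guided_attention([0.0] * len(doc), weights)
```

With uniform raw logits, attention mass shifts toward the query-matching tokens, and the rarer term ("solar") receives more than the more common one ("power"). In a real model the raw logits would come from learned query-key interactions, and an entity-derived signal could be mixed in alongside the lexical one.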
