ExpandR: Teaching Dense Retrievers Beyond Queries with LLM Guidance

Sijia Yao, Pengcheng Huang, Zhenghao Liu, Yu Gu, Yukun Yan, Shi Yu, Ge Yu

TL;DR

ExpandR tackles the semantic gap in dense retrieval by jointly training an LLM and a dense retriever, using LLM-generated query expansions that are optimized under ranking-aware objectives. The framework decomposes the joint probability into retriever-driven and expansion-driven terms. The LLM is trained with Direct Preference Optimization, guided by a reward model that combines self-consistency and retriever-oriented signals. Empirically, ExpandR achieves consistent gains across retriever backbones on BEIR and MS MARCO, with notable improvements on challenging datasets, and ablations confirm the importance of joint optimization and reward modeling. The work advances end-to-end alignment between generation and retrieval, offering a scalable approach to improving semantic matching in retrieval systems across diverse domains.

Abstract

Large language models (LLMs) have demonstrated significant potential in enhancing dense retrieval through query augmentation. However, most existing methods treat the LLM and the retriever as separate modules, overlooking the alignment between generation and ranking objectives. In this work, we propose ExpandR, a unified LLM-augmented dense retrieval framework that jointly optimizes both the LLM and the retriever. ExpandR employs the LLM to generate semantically rich query expansions, which are leveraged to enhance the retriever's training. Simultaneously, the LLM is trained using Direct Preference Optimization (DPO), guided by a carefully designed reward function that balances retrieval effectiveness and generation consistency. This joint optimization paradigm enables mutual adaptation between the LLM and the retriever, resulting in query expansions that are both informative and well-suited for retrieval. Experimental results on multiple benchmarks show that ExpandR consistently outperforms strong baselines, achieving more than a 5% improvement in retrieval performance. All code is available at https://github.com/NEUIR/ExpandR.
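
To make the DPO step in the abstract more concrete, the following is a minimal sketch of how a preference pair could be chosen from candidate expansions and scored with the standard DPO loss. It is not taken from the ExpandR code base. The function names (`pick_preference_pair`, `dpo_loss`), the plain sum used to combine the two reward signals, and the value of `beta` are all assumptions made for illustration; the paper's actual reward function is not specified in this summary.

```python
import math


def pick_preference_pair(expansions, retrieval_rewards, consistency_rewards):
    """Return (chosen, rejected) expansions by combined reward.

    Assumption: the two reward signals are combined by a plain sum.
    The real ExpandR reward may weight or normalize them differently.
    """
    totals = [r + c for r, c in zip(retrieval_rewards, consistency_rewards)]
    indices = range(len(expansions))
    best = max(indices, key=totals.__getitem__)
    worst = min(indices, key=totals.__getitem__)
    return expansions[best], expansions[worst]


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Inputs are sequence log-probabilities under the trained policy and
    under the frozen reference model. beta=0.1 is an illustrative value.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference model assign identical log-probabilities, the margin is zero and the loss equals log 2. The loss falls below that value as the policy shifts probability mass toward the chosen expansion relative to the reference.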

Paper Structure

This paper contains 23 sections, 17 equations, 5 figures, and 10 tables.

Figures (5)

  • Figure 1: Illustration of Our ExpandR Model. ExpandR optimizes both dense retriever and LLM using the LLM-guided contrastive training method and the ranking preference alignment method.
  • Figure 2: Embedding Visualization of Different Models.
  • Figure 3: Effect of Reward Modeling on the Semantic Alignment of Query Expansions.
  • Figure 4: Improvements of ExpandR in Both Unsupervised and Supervised Dense Retrievers. We plot the change of nDCG@10 scores before and after the query expansion using our ExpandR model.
  • Figure 5: Average Length of Query Expansions Generated by Different Models.