Reinforced Information Retrieval

Chaofan Li, Zheng Liu, Jianlyv Chen, Defu Lian, Yingxia Shao

TL;DR

Reinforced-IR tackles cross-domain retrieval by jointly adapting a pre-trained retriever and an LLM-based generator through a self-boosting loop. The method alternates generator optimization (RLRF) and retriever optimization (RLGF) using an unlabeled target-domain corpus, employing generation-augmented queries with a proximity-based objective and knowledge-distillation-style feedback. Key contributions include the first joint retriever-generator framework for end-to-end optimization, the design of DPO-based generator training, and a proximity-guided retriever objective that leverages generator outputs. Extensive experiments on BEIR and AIR-Bench show substantial improvements over HyDE, Doc2Query, QGen, GPL, and base retrievers, particularly in low-resource domains, highlighting the practical potential for domain-adaptive retrieval systems.

Abstract

While retrieval techniques are widely used in practice, they still face significant challenges in cross-domain scenarios. Recently, generation-augmented methods have emerged as a promising solution to this problem. These methods enhance raw queries by incorporating additional information from an LLM-based generator, facilitating more direct retrieval of relevant documents. However, existing methods struggle with highly specialized situations that require extensive domain expertise. To address this problem, we present Reinforced-IR, a novel approach that jointly adapts a pre-trained retriever and generator for precise cross-domain retrieval. A key innovation of Reinforced-IR is its Self-Boosting framework, which enables the retriever and the generator to learn from each other's feedback. Specifically, the generator is reinforced to generate query augmentations that enhance the retriever's performance, while the retriever is trained to better discriminate the relevant documents identified by the generator. This iterative process allows the end-to-end retrieval performance to be progressively optimized using an unlabeled corpus from the target domain. In our experiments, Reinforced-IR outperforms existing domain adaptation methods by a large margin, leading to substantial improvements in retrieval quality across a wide range of application scenarios.
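
To make the idea of generation-augmented retrieval concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a HyDE-style scheme in which the query embedding is averaged with the embeddings of generator-written hypothetical documents. The bag-of-words `embed` function is a toy stand-in for a dense retriever, and all function names (`embed`, `average`, `dot`, `score`, `retrieve`) are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy L2-normalized bag-of-words vector (assumed stand-in for a dense retriever)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def average(vectors):
    """Element-wise mean of sparse vectors stored as dicts."""
    out = Counter()
    for vec in vectors:
        for tok, weight in vec.items():
            out[tok] += weight / len(vectors)
    return dict(out)


def dot(a, b):
    """Inner product of two sparse vectors."""
    return sum(weight * b.get(tok, 0.0) for tok, weight in a.items())


def score(query, hypothetical_docs, doc):
    """Similarity between the augmented query and one corpus document."""
    q_vec = average([embed(query)] + [embed(h) for h in hypothetical_docs])
    return dot(q_vec, embed(doc))


def retrieve(query, hypothetical_docs, corpus, top_k=1):
    """Rank corpus documents by similarity to the augmented query."""
    ranked = sorted(corpus, key=lambda d: score(query, hypothetical_docs, d), reverse=True)
    return ranked[:top_k]


corpus = [
    "aspirin inhibits cyclooxygenase enzymes and reduces inflammation",
    "the stock market fell sharply on monday",
]
query = "how does aspirin work"
# In practice these would come from the LLM-based generator; hard-coded here.
hypothetical = ["aspirin works because it inhibits cyclooxygenase enzymes"]
best = retrieve(query, hypothetical, corpus)
```

In this toy setup the hypothetical document shares domain vocabulary with the relevant passage, so the augmented query scores that passage higher than the raw query alone does. This is the effect that a well-adapted generator is meant to provide.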

Paper Structure

This paper contains 16 sections, 8 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Reinforced-IR jointly adapts the retriever and the generator with an unlabeled domain corpus via self-boosting. The well-adapted generator augments the raw query with hypothetical documents, which enables relevant documents to be retrieved.
  • Figure 2: Self-Boosting workflow. 1) RLRF: the generator is reinforced through DPO to produce the retriever's preferred query augmentation (marked by a thumbs-up). 2) RLGF: the retriever is reinforced to discriminate the generator's preferred documents (measured by preference score $w_{-}$) in the form of knowledge distillation.
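
The two training signals named in Figure 2 can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's exact objectives. `dpo_loss` is the standard DPO loss on one preferred/rejected pair of query augmentations. `distillation_loss` assumes the knowledge distillation takes the form of a KL divergence between softmax-normalized generator preference scores (teacher) and retriever similarity scores (student). Function names, the `beta` and `temperature` defaults, and the input values are all hypothetical.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair of augmentations.

    Inputs are sequence log-probabilities under the trained generator
    (policy) and a frozen reference generator.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) computed in a numerically stable way.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


def softmax(scores, temperature=1.0):
    """Softmax with max-subtraction for numerical stability."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(generator_scores, retriever_scores, temperature=1.0):
    """KL(teacher || student) over a list of candidate documents.

    Assumption: generator preference scores act as the teacher and
    retriever similarity scores as the student.
    """
    teacher = softmax(generator_scores, temperature)
    student = softmax(retriever_scores, temperature)
    return sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)


# RLRF side: the policy already favors the retriever-preferred augmentation.
generator_loss = dpo_loss(-12.0, -15.0, -13.0, -14.0)

# RLGF side: the retriever disagrees with the generator's document preferences.
retriever_loss = distillation_loss([3.0, 1.0, 0.5], [0.5, 1.0, 3.0])
```

In a self-boosting loop these two losses would be minimized in alternation: the retriever's rankings decide which augmentation counts as chosen, and the generator's preferences supply the teacher scores for the retriever.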