Optimizing Retrieval for RAG via Reinforced Contrastive Learning

Jiawei Zhou, Lei Chen

TL;DR

This work tackles the IR–RAG relevance gap by introducing R3, a Retrieval framework optimized for Retrieval-Augmented Generation through Reinforced Contrastive Learning. By coupling on-policy retrieval with a semi-parametric retriever (SiDR), offline probability assessments, and a reinforced contrastive loss that jointly updates parametric and semi-parametric components, R3 effectively learns environment-specific relevance within RAG. Empirical results across multiple QA benchmarks show consistent gains over state-of-the-art retrievers and competitive performance with LLM-augmented retrieval, while maintaining efficiency (4 GPUs, within one day). The study also provides rigorous ablations and cost analyses, highlighting the value and limits of environment-specific retrieval learning for RAG systems.

Abstract

As retrieval-augmented generation (RAG) becomes increasingly widespread, the role of information retrieval (IR) is shifting from retrieving information for human users to retrieving contextual knowledge for artificial intelligence (AI) systems, where relevance becomes difficult to define or annotate beforehand. To address this challenge, we propose R3, a Retrieval framework optimized for RAG through trial-and-feedback Reinforced contrastive learning. Unlike prior approaches that rely on annotated or synthetic data for supervised fine-tuning, R3 enables the retriever to dynamically explore and optimize relevance within the RAG environment. During training, the retrieved results interact with the environment to produce contrastive signals that automatically guide the retriever's self-improvement. Extensive experiments across diverse tasks demonstrate that R3 improves RAG performance by 5.2% over the original retriever and surpasses state-of-the-art retrievers by 4.9%, while achieving comparable results to LLM-augmented retrieval and RAG systems built on post-trained or instruction-tuned LLMs. It is both efficient and practical, requiring only 4 GPUs and completing training within a single day.
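To make the idea of environment feedback producing contrastive signals concrete, here is a minimal sketch in Python. It is not the paper's loss: it assumes each retrieved passage receives a scalar reward from the RAG environment (for example, how likely the generator is to produce the correct answer given that passage), splits passages into positives and negatives with a hypothetical `threshold`, and applies a standard InfoNCE-style objective. The function name, the threshold rule, and the reward scale are all illustrative assumptions.

```python
import math


def feedback_contrastive_loss(scores, rewards, threshold=0.5):
    """InfoNCE-style loss for one query (illustrative sketch, not the R3 loss).

    scores:    retriever similarity scores, one per retrieved passage
    rewards:   environment feedback per passage (assumed to lie in [0, 1])
    threshold: hypothetical cutoff separating positives from negatives
    """
    positives = [i for i, r in enumerate(rewards) if r >= threshold]
    negatives = [i for i, r in enumerate(rewards) if r < threshold]
    # Without both a positive and a negative there is no contrastive signal.
    if not positives or not negatives:
        return 0.0
    # Numerically stable log-sum-exp over all retrieved passages.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    # Average negative log-likelihood of the positives under a softmax.
    return sum(log_z - scores[i] for i in positives) / len(positives)


# The passage the environment rewarded is ranked first: low loss.
good = feedback_contrastive_loss([2.0, 0.0], [0.9, 0.1])
# The rewarded passage is ranked last: high loss.
bad = feedback_contrastive_loss([0.0, 2.0], [0.9, 0.1])
```

Under these assumptions, the loss is small when the retriever already scores the rewarded passage highest and large when it does not, so minimizing it pushes the retriever toward passages that help the downstream generator.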

Paper Structure

This paper contains 29 sections, 8 equations, 14 figures, 8 tables.

Figures (14)

  • Figure 1: Comparison of traditional IR setting and RAG setting.
  • Figure 2: Illustration of the R3 training process.
  • Figure 3: Overview illustration of SiDR.
  • Figure 4: Ablation of offline and online learning.
  • Figure 5: Ablation on re-indexing strategies.
  • ...and 9 more figures