GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis

Yi Jiang, Sendong Zhao, Jianbo Li, Haochun Wang, Bing Qin

TL;DR

This paper tackles the misalignment between retrieved content and LLM preferences in retrieval-augmented generation by introducing GainRAG, a gain-signal–driven framework. It defines a gain metric based on perplexity and a contrastive decoding regime to quantify how much a passage helps produce correct outputs, and trains a light-weight selector using limited data augmented with a pseudo-passage to avoid degradation. The approach ships a GainRAG inference workflow that selects the highest-gain passage to feed into the LLM, accompanied by a distillation-based training regime for the selector. Empirical results across six datasets demonstrate strong performance and robust generalization, highlighting the practical value of gain-guided alignment for RAG systems.
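
As a rough illustration of the idea summarized above, the sketch below scores each candidate passage by how much it lowers the perplexity of the correct answer and then picks the highest-scoring one. It is a minimal sketch under stated assumptions, not the paper's formula: the contrastive decoding component is omitted entirely, the gain is assumed to be a simple log-perplexity difference, and the function names (`perplexity`, `gain`, `select_passage`) are hypothetical. The per-token log-probabilities are assumed to come from some LLM scoring the gold answer with and without the passage in context.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity of a sequence from its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def gain(with_passage: Sequence[float], without_passage: Sequence[float]) -> float:
    """Assumed gain: drop in log-perplexity of the gold answer once the
    passage is added to the context. Positive means the passage helps."""
    return math.log(perplexity(without_passage)) - math.log(perplexity(with_passage))


def select_passage(gains: Sequence[float]) -> int:
    """Index of the candidate with the highest gain (first one on ties)."""
    if not gains:
        raise ValueError("need at least one candidate")
    return max(range(len(gains)), key=lambda i: gains[i])


# Hypothetical log-probabilities of the gold answer tokens.
baseline = [-2.0, -2.0]                 # no passage in context
candidates = [[-1.0, -1.0],             # passage A: answer becomes more likely
              [-3.0, -3.0]]             # passage B: answer becomes less likely
gains = [gain(c, baseline) for c in candidates]
best = select_passage(gains)
```

In this toy input, passage A has positive gain and passage B has negative gain, so passage A is selected even if a retriever had ranked B higher.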

Abstract

The Retrieval-Augmented Generation (RAG) framework introduces a retrieval module to dynamically inject retrieved information into the input context of large language models (LLMs), and has demonstrated significant success in various NLP tasks. However, the current study points out that there is a preference gap between retrievers and LLMs in the RAG framework, which limits further improvement of system performance. Some highly relevant passages may interfere with LLM reasoning because they contain complex or contradictory information, while some indirectly related or even inaccurate content may help the LLM generate more accurate answers by providing suggestive information or logical clues. To solve this, we propose GainRAG, a novel approach that aligns the retriever's and LLM's preferences by defining a new metric, "gain", which measures how well an input passage contributes to correct outputs. Specifically, we propose a method to estimate these gain signals and train a middleware that aligns the preferences of the retriever and the LLM using only limited data. In addition, we introduce a pseudo-passage strategy to mitigate degradation. The experimental results on 6 datasets verify the effectiveness of GainRAG.
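
One way to picture training a middleware from gain signals is as a distillation objective: the estimated gains over a candidate list define a target distribution, and the selector's scores are pushed toward it. The sketch below is an assumption-laden illustration, not the paper's training recipe. The softmax-plus-KL loss, the `temperature` parameter, and all function names are hypothetical, and the pseudo-passage is modelled simply as one extra entry in the candidate list.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q); q is clamped away from zero to avoid division errors."""
    return sum(pi * math.log(pi / max(qi, 1e-12))
               for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(teacher_gains: Sequence[float],
                      selector_scores: Sequence[float],
                      temperature: float = 1.0) -> float:
    """Assumed objective: match the selector's score distribution to the
    distribution implied by the estimated gains."""
    if len(teacher_gains) != len(selector_scores):
        raise ValueError("one selector score is needed per candidate")
    target = softmax(teacher_gains, temperature)
    predicted = softmax(selector_scores)
    return kl_divergence(target, predicted)


# Hypothetical candidate list: two retrieved passages plus one pseudo-passage
# appended as the last entry.
gains = [1.0, 0.0, -1.0]
aligned_loss = distillation_loss(gains, [1.0, 0.0, -1.0])
misaligned_loss = distillation_loss(gains, [-1.0, 0.0, 1.0])
```

A selector whose scores already rank the candidates the way the gains do incurs zero loss here, while one that ranks them in reverse incurs a positive loss.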

Paper Structure

This paper contains 31 sections, 12 equations, 8 figures, 6 tables, 2 algorithms.

Figures (8)

  • Figure 1: We analyze the preference gap between retrieved passages and LLMs on 2 datasets: HotpotQA and 2WikiMultiHopQA. The top shows the proportion of correct and incorrect generations when the retrieved passage contains the gold answer. The bottom shows, for cases where the LLM response is correct, the proportion in which the passage used contains the gold answer.
  • Figure 2: Illustration of the GainRAG framework. The GainRAG workflow, preference signal synthesis, and selector distillation fine-tuning are shown respectively.
  • Figure 3: Illustration of gain. Changes in recall of the gold answer and downstream performance after using GainRAG.
  • Figure 4: Illustration of the pseudo-passages generated for each dataset to avoid degenerate solutions.
  • Figure 5: Changes in recall and downstream generation performance as the number of passages increases. The left side shows our selector and the right side the BGE-reranker; the upper and lower parts show recall and Avg(EM, F1), respectively.
  • ...and 3 more figures