RaFe: Ranking Feedback Improves Query Rewriting for RAG

Shengyu Mao, Yong Jiang, Boli Chen, Xiao Li, Peng Wang, Xinyu Wang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

TL;DR

RaFe addresses the lack of generalizable feedback signals for query rewriting in RAG by leveraging a public reranker to provide ranking-based feedback without requiring annotated data. It employs a two-stage process: an initial supervised fine-tuning stage to learn rewrites, followed by offline or online feedback training that uses reranker scores to align rewrites with retrieval objectives. Across English and Chinese open-domain QA benchmarks, RaFe yields consistent improvements, notably in Expand-Ranked scenarios, demonstrating strong cross-lingual transfer and practical efficiency. The work suggests promising directions for joint training of rerankers and rewrite models, while acknowledging limitations in cross-domain validation and reliance on the availability of effective rerankers.

Abstract

As Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) techniques have evolved, query rewriting has been widely incorporated into RAG systems for downstream tasks such as open-domain QA. Many works have attempted to improve query rewriting by training small models with reinforcement learning rather than relying on costly LLMs. However, current methods require annotations (e.g., labeled relevant documents or downstream answers) or predesigned rewards for feedback, which generalize poorly and fail to use signals tailored to query rewriting. In this paper, we propose RaFe, a framework for training query rewriting models without annotations. By leveraging a publicly available reranker, RaFe provides feedback that aligns well with the rewriting objectives. Experimental results demonstrate that RaFe outperforms the baselines.

Paper Structure

This paper contains 58 sections, 10 equations, 8 figures, and 12 tables.

Figures (8)

  • Figure 1: Illustration of query rewriting for RAG. The left part shows the normal RAG pipeline, while the right part shows query rewriting used to retrieve additional relevant documents for RAG.
  • Figure 2: Overview of RaFe. The procedure consists of two stages: initial SFT and subsequent feedback training. RaFe obtains ranking feedback aligned with the goal of query rewriting without annotated data, and can use that feedback in two ways. Offline training: constructing good-bad pairs from offline-generated data. Online training: scoring queries generated in real time and completing feedback training.
  • Figure 3: Three types of examples, including the original query and rewrites from SFT and RaFe. The Prec@5 results of queries and rewrites are presented, and "Correct" denotes whether the prediction is correct.
  • Figure 4: The performance of different rewrite models before and after all documents are reranked under the Expand setting. The number displayed on each bar is the improvement from Raw to Ranked.
  • Figure 5: Results for different numbers of rewrites in the Expand setting. We list results for 0 to 5 rewrites. The rewrites are generated by RaFe$_{(KTO)}$.
  • ...and 3 more figures
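
The offline path in the Figure 2 caption (constructing good-bad pairs from reranker feedback) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the helper names `ranking_score` and `build_preference_pairs`, the choice to average reranker scores over the retrieved documents, the fixed threshold, and the toy word-overlap reranker that stands in for a real public reranker.

```python
from itertools import product
from statistics import mean
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not the paper's API):
#   retrieve(rewrite)   -> documents retrieved for that rewrite
#   rerank(query, doc)  -> relevance score of doc with respect to query
Retriever = Callable[[str], List[str]]
Reranker = Callable[[str, str], float]


def ranking_score(query: str, docs: List[str], rerank: Reranker) -> float:
    """Average reranker score of docs against the ORIGINAL query.

    Averaging is an illustrative choice; an empty result scores 0.0.
    """
    if not docs:
        return 0.0
    return mean(rerank(query, doc) for doc in docs)


def build_preference_pairs(
    query: str,
    rewrites: List[str],
    retrieve: Retriever,
    rerank: Reranker,
    threshold: float,
) -> List[Tuple[str, str]]:
    """Return (good, bad) rewrite pairs split by an assumed score threshold."""
    scored = [(rw, ranking_score(query, retrieve(rw), rerank)) for rw in rewrites]
    good = [rw for rw, score in scored if score > threshold]
    bad = [rw for rw, score in scored if score <= threshold]
    return list(product(good, bad))


# --- Toy stand-ins so the sketch runs without any model or index ---
TOY_INDEX = {
    "capital of france": ["paris is the capital of france"],
    "french food": ["croissants are a french pastry"],
}


def toy_retrieve(rewrite: str) -> List[str]:
    return TOY_INDEX.get(rewrite, [])


def toy_rerank(query: str, doc: str) -> float:
    """Fraction of query tokens that also appear in the document."""
    query_tokens = set(query.split())
    doc_tokens = set(doc.split())
    return len(query_tokens & doc_tokens) / len(query_tokens)


pairs = build_preference_pairs(
    query="what is the capital of france",
    rewrites=["capital of france", "french food"],
    retrieve=toy_retrieve,
    rerank=toy_rerank,
    threshold=0.5,
)
print(pairs)
```

In this toy run the first rewrite retrieves a document that overlaps heavily with the original query and the second does not, so the two rewrites form a single good-bad pair. Pairs of this shape are the kind of input a preference-based training step would consume. The online path would instead score rewrites as they are generated.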