FrugalRAG: Less is More in RL Finetuning for Multi-Hop Question Answering

Abhinav Java; Srivathsan Koundinyan; Nagarajan Natarajan; Amit Sharma

TL;DR

FrugalRAG is a two-stage finetuning framework that adaptively reduces the number of retrieval steps based on a question's difficulty. It attains state-of-the-art efficiency-accuracy tradeoffs, cutting retrieval cost nearly in half.

Abstract

Reinforcement learning (RL) based on the final answer's reward has driven recent progress in small language models (SLMs) on reasoning-heavy tasks such as math and code. However, applying the same techniques to retrieval-augmented generation (RAG) benchmarks like multi-hop QA has yielded limited gains, often trailing supervised or prompting-only baselines. Instead, we argue that a viable path for RL in multi-hop QA is to use test-time scaling judiciously to optimize both final answer accuracy and efficiency in reaching that answer. We propose FrugalRAG, a two-stage finetuning framework that adaptively reduces the number of retrieval steps based on a question's difficulty. First, we train an SLM with supervised finetuning on a full-exploration policy that generates broad sub-queries. Then, we apply RL to adaptively prune search depth based on question difficulty, directly rewarding policies that balance correctness with frugality. Unlike prior approaches requiring 10x more data, our method achieves competitive performance with only approximately 1,000 examples. On HotPotQA and other multi-hop QA benchmarks, FrugalRAG attains state-of-the-art efficiency-accuracy tradeoffs, cutting retrieval cost nearly in half. Moreover, on the challenging BrowseCompPlus benchmark, it generalizes zero-shot and surpasses SLM-based and other baselines. These results demonstrate the use of RL not to increase reasoning steps, but to reduce them, as an effective solution for scalable and efficient RAG.
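
To make the idea of "rewarding policies that balance correctness with frugality" concrete, here is a minimal sketch of one possible reward shape. The abstract does not specify the actual reward used in FrugalRAG, so the function name `frugal_reward`, the parameters `max_retrievals` and `penalty`, and the linear discount are all illustrative assumptions rather than the authors' formulation.

```python
def frugal_reward(
    is_correct: bool,
    num_retrievals: int,
    max_retrievals: int = 5,   # hypothetical search budget, not from the paper
    penalty: float = 0.5,      # hypothetical weight on retrieval cost
) -> float:
    """Illustrative reward: a correct answer earns 1.0, discounted by the
    fraction of the retrieval budget that was spent. Wrong answers earn 0.0,
    so stopping early is only rewarded when the answer is still correct."""
    if max_retrievals <= 0:
        raise ValueError("max_retrievals must be positive")
    if not 0.0 <= penalty <= 1.0:
        raise ValueError("penalty must lie in [0, 1]")

    # Clamp the retrieval count into [0, max_retrievals] and normalize it.
    used = min(max(num_retrievals, 0), max_retrievals) / max_retrievals

    if not is_correct:
        return 0.0
    return 1.0 - penalty * used
```

Under these assumptions, a correct answer reached with fewer retrievals scores higher than a correct answer that uses the full budget, while an incorrect answer scores zero regardless of how few retrievals it used. This is one simple way to discourage a policy from trading accuracy for fewer searches.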
