Plan*RAG: Efficient Test-Time Planning for Retrieval Augmented Generation

Prakhar Verma; Sukruta Prakash Midigeshi; Gaurav Sinha; Arno Solin; Nagarajan Natarajan; Amit Sharma

TL;DR

Plan*RAG introduces test-time planning for multi-hop retrieval-augmented generation by externalizing the reasoning plan as a directed acyclic graph (DAG). By decomposing queries into atomic, dynamically linked subqueries and enabling parallel execution, Plan*RAG achieves higher accuracy on standard multi-hop benchmarks at a compute cost comparable to baseline RAG methods. The approach integrates with existing RAG frameworks (e.g., Self-RAG) and demonstrates that a reasonably sized reasoning planner, including a fine-tuned Llama model, can match larger language models in planning quality. The work provides a practical, modular framework for robust multi-hop reasoning with explicit verification opportunities and bounded context usage, potentially benefiting critical knowledge-intensive applications.

Abstract

We introduce Plan*RAG, a novel framework that enables structured multi-hop reasoning in retrieval-augmented generation (RAG) through test-time reasoning plan generation. While existing approaches such as ReAct maintain reasoning chains within the language model's context window, we observe that this often leads to plan fragmentation and execution failures. Our key insight is that by isolating the reasoning plan as a directed acyclic graph (DAG) outside the LM's working memory, we can enable (1) systematic exploration of reasoning paths, (2) precise retrieval and grounding through atomic subqueries, and (3) efficiency through parallel execution and bounded context window utilization. Moreover, Plan*RAG's modular design allows it to be integrated with existing RAG methods, providing a practical way to improve current RAG systems. On standard multi-hop reasoning benchmarks, Plan*RAG consistently improves over recently proposed methods such as RQ-RAG and Self-RAG while maintaining comparable computational costs.
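
The idea of keeping the plan as a DAG outside the model's context can be illustrated with a minimal Python sketch. This is not the authors' implementation: the plan format, the `{Q1}`-style placeholders, and the names `PLAN`, `answer_subquery`, and `execute_plan` are all assumptions made for illustration, and `answer_subquery` is a stub standing in for a retrieve-then-generate call. The sketch shows two properties the abstract describes: subqueries whose parents are resolved run in parallel, and each subquery receives only its parents' answers, which keeps its context bounded.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical plan: node id -> (subquery template, parent node ids).
# "{Q1}"-style placeholders are filled with parent answers at execution time.
PLAN = {
    "Q1": ("Who directed Inception?", []),
    "Q2": ("Who composed the score of Inception?", []),
    "Q3": ("Were {Q1} and {Q2} born in the same country?", ["Q1", "Q2"]),
}


def answer_subquery(subquery: str) -> str:
    """Stub for retrieval plus generation on one atomic subquery (assumption)."""
    return f"<answer to: {subquery}>"


def execute_plan(plan, answer_fn, max_workers=4):
    """Answer every node of the plan, running ready nodes in parallel."""
    # TopologicalSorter expects a mapping of node -> predecessors.
    sorter = TopologicalSorter({node: parents for node, (_, parents) in plan.items()})
    sorter.prepare()  # raises CycleError if the plan is not a DAG
    answers = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = list(sorter.get_ready())  # nodes whose parents are all answered
            # Each subquery sees only its parents' answers, not the whole history.
            filled = [
                plan[node][0].format(**{p: answers[p] for p in plan[node][1]})
                for node in ready
            ]
            for node, answer in zip(ready, pool.map(answer_fn, filled)):
                answers[node] = answer
                sorter.done(node)
    return answers


if __name__ == "__main__":
    for node, answer in execute_plan(PLAN, answer_subquery).items():
        print(node, "->", answer)
```

In this sketch `Q1` and `Q2` have no parents, so they are dispatched together, and `Q3` is instantiated only after both have answers. Replacing `answer_subquery` with a call into an existing RAG pipeline is one way such a plan executor could wrap another method, in the spirit of the modular integration the abstract mentions.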
