ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation
Zhicheng Lee, Shulin Cao, Jinxin Liu, Jiajie Zhang, Weichuan Liu, Xiaoyin Che, Lei Hou, Juanzi Li
TL;DR
ReaRAG introduces a knowledge-guided reasoning framework that grounds large reasoning models in external knowledge through iterative retrieval with a bounded reasoning chain. By distilling reasoning capabilities into a dedicated dataset and training under a Thought-Action-Observation paradigm, ReaRAG achieves strong factuality without RL-based policy optimization. The approach demonstrates competitive or superior performance on six QA benchmarks, particularly on harder multi-hop tasks like FRAMES and FanOutQA, while offering efficiency advantages over RL methods. The work highlights the value of reflection and error-correction in grounding reasoning trajectories, though it acknowledges limitations in action space, data construction efficiency, and inference latency. Overall, ReaRAG provides a practical, scalable alternative for improving factuality in retrieval-augmented reasoning while maintaining robust performance.
Abstract
Large Reasoning Models (LRMs) exhibit remarkable reasoning abilities but rely primarily on parametric knowledge, limiting factual accuracy. While recent works equip reinforcement learning (RL)-based LRMs with retrieval capabilities, they suffer from overthinking and lack robustness in reasoning, reducing their effectiveness in question answering (QA) tasks. To address this, we propose ReaRAG, a factuality-enhanced reasoning model that explores diverse queries without excessive iterations. Our solution includes a novel data construction framework with an upper bound on the reasoning chain length. Specifically, we first leverage an LRM to generate deliberate thinking, then select an action from a predefined action space (Search and Finish). For a Search action, a query is executed against the RAG engine, and the result is returned as an observation to guide subsequent reasoning steps. This process iterates until a Finish action is chosen. Benefiting from ReaRAG's strong reasoning capabilities, our approach outperforms existing baselines on multi-hop QA. Further analysis highlights its strong reflective ability to recognize errors and refine its reasoning trajectory. Our study enhances LRMs' factuality while effectively integrating robust reasoning for Retrieval-Augmented Generation (RAG).
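The iterative process described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Step`, `run_chain`, `policy`, `rag_engine`, and `max_steps` are hypothetical, the toy policy and toy knowledge base stand in for the LRM and the RAG engine, and returning `None` when the bound is reached is an assumption, since the abstract does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    """One Thought-Action-Observation step (hypothetical structure)."""
    thought: str
    action: str            # "search" or "finish"
    argument: str          # the query for "search", the answer for "finish"
    observation: str = ""  # filled in only for "search" steps


# A policy maps (question, history) to (thought, action, argument).
Policy = Callable[[str, List[Step]], Tuple[str, str, str]]


def run_chain(
    question: str,
    policy: Policy,
    rag_engine: Callable[[str], str],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[Step]]:
    """Iterate Thought -> Action -> Observation until Finish or the bound."""
    history: List[Step] = []
    for _ in range(max_steps):  # upper bound on the reasoning chain length
        thought, action, argument = policy(question, history)
        step = Step(thought, action, argument)
        if action == "finish":
            history.append(step)
            return argument, history
        if action != "search":
            raise ValueError(f"unknown action: {action!r}")
        # Search: run the query and feed the result back as an observation.
        step.observation = rag_engine(argument)
        history.append(step)
    # Assumption: no answer is returned if the bound is hit without Finish.
    return None, history


# Toy stand-ins so the sketch runs end to end (not real components).
TOY_KB = {"capital of France": "Paris"}


def toy_rag(query: str) -> str:
    return TOY_KB.get(query, "no result")


def toy_policy(question: str, history: List[Step]) -> Tuple[str, str, str]:
    if not history:
        return ("I need to look this up.", "search", "capital of France")
    return ("The observation answers it.", "finish", history[-1].observation)


def always_search(question: str, history: List[Step]) -> Tuple[str, str, str]:
    return ("Still unsure.", "search", "unknown topic")


answer, trace = run_chain("What is the capital of France?", toy_policy, toy_rag)
```

With the toy policy, the chain performs one Search followed by one Finish. A policy that never chooses Finish, such as `always_search`, is cut off after `max_steps` iterations, which is the role the bounded chain length plays in limiting overthinking.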
