FREESON: Retriever-Free Retrieval-Augmented Reasoning via Corpus-Traversing MCTS
Chaeeun Kim, Seungone Kim
TL;DR
FREESON tackles the bottleneck of external retrievers in retrieval-augmented reasoning by enabling a single large reasoning model to act as both generator and retriever. It introduces Corpus-Traversing Monte Carlo Tree Search (CT-MCTS), a token-level, prefix-constrained search that traverses the corpus through a dynamically built CorpusTree, using multi-token node expansions and an on-policy value network to guide retrieval toward answer-containing regions. Through on-policy CT-MCTS rollouts and synthetic-data-augmented training, FREESON improves EM and F1 by an average of 14.4% over four multi-step reasoning baselines that use a separate retriever, measured across five open-domain QA benchmarks. It performs comparably to the strongest baseline and surpasses it by 3% on PopQA and 2WikiMultihopQA. The approach eliminates the need for external search engines or dedicated retriever hardware, offering a retriever-free pathway for knowledge-intensive reasoning and a foundation for domain-specific applications with unlabeled corpora. Its main limitation is dependence on a predefined corpus; a possible extension is to use reinforcement learning to further optimize traversal strategies.
Abstract
Large Reasoning Models (LRMs) have demonstrated remarkable capabilities in multi-step reasoning and calling search engines at appropriate steps. However, existing retrieval-augmented reasoning approaches rely on separate retrieval models, limiting the LRM's role in retrieval to deciding when to retrieve and how to query. This separation not only increases hardware and operational costs but also leads to errors in the retrieval process due to the representation bottleneck, a phenomenon where the retriever's embedding space is not expressive enough to meet the generator's requirements. To address this, we shift our perspective from sequence-to-sequence matching to locating the answer-containing paths within the corpus, and propose a novel framework called FREESON (Retriever-FREE Retrieval-Augmented ReaSONing). This framework enables LRMs to retrieve relevant knowledge on their own by acting as both a generator and retriever. To achieve this, we introduce a variant of the MCTS algorithm specialized for the retrieval task, which we call CT-MCTS (Corpus-Traversing Monte Carlo Tree Search). In this algorithm, LRMs traverse through the corpus toward answer-containing regions. Our results on five open-domain QA benchmarks, including single-hop and multi-hop questions, show that FREESON achieves an average improvement of 14.4% in EM and F1 over four multi-step reasoning models with a separate retriever, and it also performs comparably to the strongest baseline, surpassing it by 3% on PopQA and 2WikiMultihopQA.
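To make the idea of traversing a corpus concrete, the following is a minimal sketch of prefix-constrained traversal over a token-level prefix tree. It is not the CT-MCTS algorithm itself: the search here is greedy, with no rollouts and no value network. Tokenization by whitespace, the names `TrieNode`, `build_corpus_trie`, `allowed_next_tokens`, and `greedy_traverse`, and the toy scoring function standing in for the LRM's token preferences are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


class TrieNode:
    """One node of a token-level prefix tree built over corpus passages."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}


def build_corpus_trie(passages: List[str]) -> TrieNode:
    """Insert every passage into the trie, one whitespace token per edge."""
    root = TrieNode()
    for passage in passages:
        node = root
        for token in passage.split():
            node = node.children.setdefault(token, TrieNode())
    return root


def allowed_next_tokens(root: TrieNode, prefix: List[str]) -> List[str]:
    """Return the tokens that can follow `prefix` somewhere in the corpus."""
    node = root
    for token in prefix:
        node = node.children.get(token)
        if node is None:
            return []  # the prefix does not occur in the corpus
    return sorted(node.children)


def greedy_traverse(
    root: TrieNode,
    score: Callable[[List[str], str], float],
    max_steps: int = 32,
) -> List[str]:
    """Follow the highest-scoring allowed token at each step (greedy, not MCTS)."""
    prefix: List[str] = []
    for _ in range(max_steps):
        candidates = allowed_next_tokens(root, prefix)
        if not candidates:
            break  # reached the end of a passage
        prefix.append(max(candidates, key=lambda tok: score(prefix, tok)))
    return prefix


# Toy corpus and a hypothetical scorer standing in for the model's preferences.
passages = [
    "Paris is the capital of France",
    "Paris is a city in Texas",
    "Berlin is the capital of Germany",
]
trie = build_corpus_trie(passages)


def toy_score(prefix: List[str], token: str) -> float:
    return 1.0 if token in {"Paris", "the"} else 0.0


retrieved = " ".join(greedy_traverse(trie, toy_score))
print(retrieved)
```

Because every step is restricted to tokens that continue an existing corpus prefix, the generated sequence is always a span that actually occurs in the corpus. A greedy walk like this commits to early choices it cannot revise, which is the kind of weakness a tree search with value estimates is meant to address.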
