ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA
Xinjie Zhao, Fan Gao, Xingyu Song, Yingjian Chen, Rui Yang, Yanran Fu, Yuyang Wang, Yusuke Iwasawa, Yutaka Matsuo, Irene Li
TL;DR
ReAgent introduces a reversible multi-agent framework for knowledge-enhanced multi-hop QA that integrates explicit local and global backtracking to mitigate error propagation in forward reasoning. The architecture couples Execution, Supervisory, and Interaction layers with specialized agents for decomposition, retrieval, verification, and assembly, and coordinates backtracking through a Verifier, Controller, and Supervisor to resolve intra- and inter-agent contradictions. Empirical results on HotpotQA, 2WikiMultiHopQA, and MuSiQue show average improvements of about 6% over strong baselines, along with improved interpretability due to traceable backtracking. The work advances robust, error-tolerant QA by enabling corrections mid-reasoning and providing a foundation for scalable, trustworthy collaborative AI systems.
Abstract
Recent advances in large language models (LLMs) have significantly improved multi-hop question answering (QA) through direct Chain-of-Thought (CoT) reasoning. However, the irreversible nature of CoT leads to error accumulation, making it challenging to correct mistakes in multi-hop reasoning. This paper introduces ReAgent: a Reversible multi-Agent collaborative framework augmented with explicit backtracking mechanisms, enabling reversible multi-hop reasoning. By incorporating text-based retrieval, information aggregation, and validation, our system can detect and correct errors mid-reasoning, leading to more robust and interpretable QA outcomes. The framework and experiments serve as a foundation for future work on error-tolerant QA systems. Empirical evaluations across three benchmarks indicate ReAgent's efficacy, yielding an average improvement of about 6% over baseline models.
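To make the idea of reversible reasoning concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes each hop receives a ranked list of candidate answers from some retriever and that a verifier can be expressed as a set of consistency checks over the facts asserted so far. The names `ReasoningState`, `verify`, and `reversible_reason` are hypothetical. Retrying another candidate at the same hop stands in for local backtracking, and returning to an earlier hop stands in for global backtracking.

```python
from dataclasses import dataclass, field


@dataclass
class ReasoningState:
    """Facts asserted so far, plus snapshots that allow rollback (hypothetical)."""
    facts: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def checkpoint(self):
        # Save a copy of the current facts before a new assertion.
        self.history.append(dict(self.facts))

    def rollback(self):
        # Restore the most recent snapshot, undoing the last assertion.
        self.facts = self.history.pop()


def verify(facts, constraints):
    """Stand-in verifier: every consistency check must hold."""
    return all(check(facts) for check in constraints)


def reversible_reason(hops, constraints):
    """hops: ordered list of (slot, ranked_candidates).

    Returns (facts or None, number_of_backtracks).
    """
    state = ReasoningState()
    choice = [0] * len(hops)  # index of the candidate currently tried per hop
    backtracks = 0
    i = 0
    while 0 <= i < len(hops):
        slot, candidates = hops[i]
        if choice[i] >= len(candidates):
            # Candidates exhausted: return to the previous hop (global backtrack).
            choice[i] = 0
            i -= 1
            if i >= 0:
                state.rollback()
                choice[i] += 1
                backtracks += 1
            continue
        state.checkpoint()
        state.facts[slot] = candidates[choice[i]]
        if verify(state.facts, constraints):
            i += 1  # assertion is consistent, move to the next hop
        else:
            # Contradiction: undo and try the next candidate (local backtrack).
            state.rollback()
            choice[i] += 1
            backtracks += 1
    return (state.facts if i == len(hops) else None), backtracks


# Toy question: the top-ranked director is wrong, which only shows up one hop later.
hops = [("director", ["Alice", "Bob"]), ("birthplace", ["Paris"])]
constraints = [lambda f: "birthplace" not in f or f["director"] == "Bob"]
answer, n_backtracks = reversible_reason(hops, constraints)
print(answer, n_backtracks)
```

In this toy run the first hop commits to a wrong candidate, the contradiction surfaces only at the second hop, and the search undoes both assertions before succeeding. A purely forward chain would have no way to revise the first hop.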
