PRISM: A Principled Framework for Multi-Agent Reasoning via Gain Decomposition
Yiming Yang, Zhuoyuan Li, Fanxiang Zeng, Hao Fu, Yue Liu
TL;DR
PRISM introduces a principled three-dimensional gain decomposition for multi-agent reasoning: Exploration, Information, and Aggregation. It then formalizes a four-phase framework (Propose, Execute, Review, Synthesize) that jointly optimizes all three dimensions, with theoretical guarantees on information sufficiency and convergence via a potential-game formulation. Empirically, PRISM achieves state-of-the-art performance across math, code, and tool-use benchmarks and demonstrates superior compute efficiency compared to partial-dimension approaches. The work provides actionable design principles and a diagnostic lens to guide future multi-agent systems toward balanced, scalable performance.
Abstract
Multi-agent collaboration has emerged as a promising paradigm for enhancing the reasoning capabilities of Large Language Models (LLMs). However, existing approaches remain largely heuristic and offer no principled guidance on what drives performance gains or how to optimize multi-agent reasoning systematically. In particular, it remains unclear why multi-agent collaboration outperforms single-agent reasoning and which design choices contribute most to the gains, which makes it difficult to build better systems. We address this gap with a unified theoretical framework that decomposes multi-agent reasoning gains into three conceptually independent dimensions: Exploration for diverse solution coverage, Information for high-fidelity feedback, and Aggregation for principled consensus. Through this lens, existing methods can be understood as special cases that optimize only a subset of these dimensions. Building on this decomposition, we propose PRISM (Propose-Review-Integrate Synthesis for Multi-agent Reasoning), a framework that jointly maximizes all three dimensions through role-based diversity, execution-grounded feedback with evidence-based cross-evaluation, and iterative synthesis with closed-loop validation. Extensive experiments on mathematical reasoning, code generation, and function calling benchmarks show that PRISM achieves state-of-the-art performance with better compute efficiency than methods that optimize only some of the dimensions. The theoretical framework also provides actionable design principles for future multi-agent reasoning systems.
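To make the three dimensions concrete, the following is a minimal toy sketch of one Propose, Execute, Review, Synthesize round. It is an illustration under stated assumptions, not the PRISM implementation. All names (`propose`, `execute`, `review`, `synthesize`, `prism_round`, the `role_*` functions, and `TESTS`) are hypothetical, and the LLM agents are replaced by plain Python functions that each return a candidate solution to a toy "sum a list" task.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical types for this toy sketch (not from the paper).
Candidate = Callable[[List[int]], int]
Role = Callable[[], Candidate]
TestCase = Tuple[List[int], int]


def propose(roles: List[Role]) -> List[Candidate]:
    """Exploration: each role contributes one candidate solution."""
    return [role() for role in roles]


def execute(candidate: Candidate, tests: List[TestCase]) -> float:
    """Information: run the candidate and return the fraction of tests passed."""
    passed = 0
    for inputs, expected in tests:
        try:
            if candidate(inputs) == expected:
                passed += 1
        except Exception:
            pass  # a crash counts as a failed test
    return passed / len(tests)


def review(evidence: List[float]) -> List[int]:
    """Review: rank candidate indices by their execution evidence, best first."""
    return sorted(range(len(evidence)), key=lambda i: evidence[i], reverse=True)


def synthesize(candidates: List[Candidate], ranking: List[int],
               tests: List[TestCase], threshold: float) -> Optional[Candidate]:
    """Aggregation: return the best-ranked candidate that passes re-validation."""
    for i in ranking:
        if execute(candidates[i], tests) >= threshold:  # closed-loop check
            return candidates[i]
    return None  # no candidate met the threshold in this round


def prism_round(roles: List[Role], tests: List[TestCase],
                threshold: float = 1.0) -> Optional[Candidate]:
    """One Propose -> Execute -> Review -> Synthesize round."""
    candidates = propose(roles)
    evidence = [execute(c, tests) for c in candidates]
    ranking = review(evidence)
    return synthesize(candidates, ranking, tests, threshold)


# Toy roles standing in for differently prompted agents (assumption).
def role_buggy() -> Candidate:
    return lambda xs: xs[0]  # wrong in general, crashes on an empty list


def role_builtin() -> Candidate:
    return sum


TESTS: List[TestCase] = [([1, 2, 3], 6), ([], 0), ([5], 5)]
best = prism_round([role_buggy, role_builtin], TESTS)
```

In this sketch the review step simply ranks candidates by the fraction of tests they pass. The evidence-based cross-evaluation described above would replace it with agents judging each other's solutions against the execution results.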
