PRISM: A Principled Framework for Multi-Agent Reasoning via Gain Decomposition

Yiming Yang, Zhuoyuan Li, Fanxiang Zeng, Hao Fu, Yue Liu

TL;DR

PRISM introduces a principled three-dimensional gain decomposition for multi-agent reasoning: Exploration, Information, and Aggregation. It then formalizes a four-phase framework (Propose, Execute, Review, Synthesize) that jointly optimizes all three dimensions, with theoretical guarantees on information sufficiency and convergence via a potential-game formulation. Empirically, PRISM achieves state-of-the-art performance across math, code, and tool-use benchmarks and demonstrates superior compute efficiency compared to partial-dimension approaches. The work provides actionable design principles and a diagnostic lens to guide future multi-agent systems toward balanced, scalable performance.

Abstract

Multi-agent collaboration has emerged as a promising paradigm for enhancing the reasoning capabilities of Large Language Models (LLMs). However, existing approaches remain largely heuristic, lacking principled guidance on what drives performance gains and how to systematically optimize multi-agent reasoning. Specifically, it remains unclear why multi-agent collaboration outperforms single-agent reasoning and which design choices contribute most to these gains, making it difficult to build better systems. We address this gap by introducing a unified theoretical framework that decomposes multi-agent reasoning gains into three conceptually independent dimensions: Exploration for diverse solution coverage, Information for high-fidelity feedback, and Aggregation for principled consensus. Through this lens, existing methods can be understood as special cases that optimize only subsets of these dimensions. Building upon this decomposition, we propose PRISM (Propose-Review-Integrate Synthesis for Multi-agent Reasoning), a framework that jointly maximizes all three dimensions through role-based diversity, execution-grounded feedback with evidence-based cross-evaluation, and iterative synthesis with closed-loop validation. Extensive experiments across mathematical reasoning, code generation, and function-calling benchmarks demonstrate that PRISM achieves state-of-the-art performance with superior compute efficiency compared to methods optimizing partial dimensions. The theoretical framework provides actionable design principles for future multi-agent reasoning systems.
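
To make the four phases named in the TL;DR concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a trivial task ("square an integer"), three hand-written candidate strategies standing in for role-based agents, and unit tests as the execution-grounded feedback. All names (`Candidate`, `propose`, `execute`, `review`, `synthesize`, `run_toy_pipeline`) are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Tests = List[Tuple[int, int]]  # (input, expected output) pairs


@dataclass
class Candidate:
    role: str                      # hypothetical role label for the proposing agent
    solve: Callable[[int], int]    # the candidate solution
    passed: int = 0                # number of tests passed, filled in by execute()


def propose() -> List[Candidate]:
    # Propose: diverse candidates (targets exploration).
    return [
        Candidate("doubler", lambda x: x + x),
        Candidate("multiplier", lambda x: x * x),
        Candidate("power", lambda x: x ** 3),
    ]


def execute(cands: List[Candidate], tests: Tests) -> List[Candidate]:
    # Execute: run every candidate and record grounded evidence.
    for c in cands:
        c.passed = sum(1 for x, want in tests if c.solve(x) == want)
    return cands


def review(cands: List[Candidate]) -> List[Candidate]:
    # Review: rank candidates by the execution evidence (targets information).
    return sorted(cands, key=lambda c: c.passed, reverse=True)


def synthesize(ranked: List[Candidate], tests: Tests) -> Tuple[Candidate, bool]:
    # Synthesize: pick the top candidate and validate it against all tests
    # (a stand-in for aggregation with closed-loop validation).
    best = ranked[0]
    return best, best.passed == len(tests)


def run_toy_pipeline() -> Tuple[str, bool]:
    tests: Tests = [(0, 0), (1, 1), (2, 4), (3, 9)]
    best, validated = synthesize(review(execute(propose(), tests)), tests)
    return best.role, validated


if __name__ == "__main__":
    print(run_toy_pipeline())
```

In a real system each phase would call an LLM, and a failed validation would presumably trigger another round. The sketch only shows how the phases hand evidence to one another.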

Paper Structure

This paper contains 45 sections, 16 theorems, 28 equations, 3 figures, 5 tables, 1 algorithm.

Key Result

Theorem 3.1

The following decomposition provides a principled framework that identifies three independently optimizable mechanisms. Under Assumptions (A1)--(A4): where $C_K := P(\bigvee_{k=1}^K Q(\tau^{(k)}) = 1)$ is the coverage probability (at least one correct proposal exists) and $\eta(f, s) := P(Q(\tau^{\text{MAS}}) = 1 \mid \exists k\!: Q(\tau^{(k)}) = 1)$ is the selection accuracy under feedback signa
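
The two quantities defined above combine through the law of total probability. The sketch below is an illustration, not the statement of Theorem 3.1. It adds the assumption that the system cannot output a correct answer when no proposal is correct, which may or may not be among (A1)--(A4). The numerical instance further assumes independent proposals with a common per-proposal success probability $p$, and the values of $p$, $K$, and $\eta$ are hypothetical.

```latex
\begin{align*}
P\bigl(Q(\tau^{\text{MAS}}) = 1\bigr)
  &= P\bigl(Q(\tau^{\text{MAS}}) = 1 \mid \exists k\colon Q(\tau^{(k)}) = 1\bigr)\, C_K \\
  &\quad + P\bigl(Q(\tau^{\text{MAS}}) = 1 \mid \neg\exists k\colon Q(\tau^{(k)}) = 1\bigr)\,(1 - C_K) \\
  &= \eta(f, s)\, C_K + 0 \cdot (1 - C_K)
   = C_K\, \eta(f, s). \\[6pt]
% Hypothetical numbers: independent proposals, p = 0.3, K = 5, eta = 0.9
C_K &= 1 - (1 - p)^K = 1 - 0.7^5 = 0.83193, \\
C_K\, \eta &= 0.83193 \times 0.9 \approx 0.749.
\end{align*}
```

Under these assumptions, raising coverage (more or more diverse proposals) and raising selection accuracy (better feedback and aggregation) act as separate multiplicative levers on the final success probability.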

Figures (3)

  • Figure 1: The gain decomposition theory. Just as a prism decomposes light, our method decomposes multi-agent gains into three conceptually independent dimensions: Exploration (diverse sampling), Information (high-fidelity feedback), and Aggregation (principled consensus).
  • Figure 2: Overview of the PRISM framework. PRISM implements a four-phase workflow to jointly maximize all three gain dimensions: Propose targets exploration gain through diverse candidate generation; Execute and Review jointly maximize information gain through grounded feedback and evidence-based cross-evaluation; Synthesize maximizes aggregation gain through iterative refinement with closed-loop validation.
  • Figure 3: Equal-budget efficiency frontier on MBPP. Each curve traces a method's Pareto envelope (the best accuracy achievable at each token budget), ensuring a strictly fair comparison across methods. PRISM dominates all baselines across the cost spectrum, matching MoA's 84.2% ceiling at ${\sim}5\times$ lower cost and scaling to 88.8%.
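
The Pareto envelope mentioned in the Figure 3 caption can be computed as sketched below. This is a minimal illustration assuming each configuration of a method is summarized as a (token cost, accuracy) pair. The function name `pareto_envelope` and the sample points are hypothetical and are not taken from the paper's data.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (token cost, accuracy)


def pareto_envelope(points: List[Point]) -> List[Point]:
    """Return the points where the best accuracy within budget strictly improves.

    Scanning configurations in order of increasing cost, a point is kept only
    if it beats every cheaper (or equally cheap) configuration.
    """
    envelope: List[Point] = []
    best = float("-inf")
    for cost, acc in sorted(points):
        if acc > best:
            best = acc
            if envelope and envelope[-1][0] == cost:
                # Same cost, higher accuracy: replace the weaker point.
                envelope[-1] = (cost, acc)
            else:
                envelope.append((cost, acc))
    return envelope


if __name__ == "__main__":
    # Hypothetical runs of a single method at different settings.
    runs = [(1000, 70.0), (2000, 68.0), (3000, 80.0), (3000, 82.0), (5000, 81.0)]
    print(pareto_envelope(runs))
```

Comparing methods by their envelopes rather than by single runs avoids penalizing a method for one poorly chosen configuration.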

Theorems & Definitions (27)

  • Theorem 3.1: Gain Decomposition
  • Remark 3.1: Conceptual Orthogonality vs. Realized Subadditivity
  • Proposition 3.2: Exploration via Diversity
  • Proposition 3.3: Information Quality Bounds
  • Remark 3.2: Task Classification and Information Gain
  • Proposition 3.4: Aggregation Efficiency
  • Theorem 3.5: PRISM Convergence and Optimality
  • Example 3.1
  • Definition A.1: Reasoning Task
  • Definition A.2: Multi-Agent System
  • ...and 17 more