Roundtable Policy: Confidence-Weighted-Consensus Aggregation Improves Multi-Agent-System Reasoning
Yu Yao, Jiayi Dong, Yang Yang, Ju Li, Yilun Du
TL;DR
Roundtable Policy addresses the challenge of aggregating heterogeneous reasoning paths in multi-agent systems for scientific tasks. It introduces a confidence-weighted memory to weigh agent contributions at inference time, enabling auditable consensus without retraining. Empirical results on ScienceEval and ScienceNarrative show notable gains across cross-domain and long-context tasks, with analyses of grader bias and inter-grader agreement. The work argues for reliability modeling and consensus formation as a core paradigm for future multi-agent collaboration.
Abstract
Multi-agent systems have demonstrated strong performance on downstream tasks, surpassing a range of single-agent baselines. A growing body of work has explored ways to improve their reasoning and collaboration, from voting and debate to complex interaction protocols. However, it remains opaque why a specific choice would be preferred in multi-agent systems. Inspired by the decision-making mechanisms of democratic committees and The Society of Mind, we introduce Roundtable Policy, an inference-time reasoning framework for multi-agent systems that performs inference through the weighted consensus of multiple LLMs. Through extensive experiments, we demonstrate that this approach significantly enhances reasoning in complex, heterogeneous scientific tasks. Roundtable Policy emphasizes structured and interpretable inference rather than opaque convergence, while requiring only black-box access and uniform procedures, making it broadly applicable to diverse multi-agent systems.
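To make the idea of weighted consensus concrete, the following is a minimal sketch and not the paper's implementation. It assumes each agent returns a discrete answer and carries a scalar confidence weight kept in a small memory. The function names `weighted_consensus` and `update_confidence`, the default weight of 0.5, and the exponential-moving-average update are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def weighted_consensus(answers, confidence):
    """Pick the answer with the largest total confidence weight.

    answers:    dict mapping agent name -> proposed answer (hypothetical format)
    confidence: dict mapping agent name -> non-negative weight (hypothetical memory)
    Returns the winning answer and each answer's share of the total weight,
    so the decision can be audited afterwards.
    """
    scores = defaultdict(float)
    for agent, answer in answers.items():
        # Agents missing from the memory contribute nothing (an assumption).
        scores[answer] += confidence.get(agent, 0.0)
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("no agent has positive confidence")
    winner = max(scores, key=scores.get)
    shares = {answer: score / total for answer, score in scores.items()}
    return winner, shares


def update_confidence(confidence, agent, grade, rate=0.2):
    """Move an agent's weight toward a grade in [0, 1].

    Uses an exponential moving average, which is an illustrative assumption
    and not a rule taken from the paper.
    """
    previous = confidence.get(agent, 0.5)
    confidence[agent] = (1 - rate) * previous + rate * grade
    return confidence


if __name__ == "__main__":
    answers = {"agent_a": "42", "agent_b": "41", "agent_c": "42"}
    confidence = {"agent_a": 0.5, "agent_b": 0.9, "agent_c": 0.5}
    winner, shares = weighted_consensus(answers, confidence)
    print(winner, shares)
```

In this sketch the two lower-confidence agents that agree outweigh the single higher-confidence agent, and the returned shares record how the decision was reached. Only the agents' outputs are needed, which is consistent with the black-box access described above.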
