Hear Both Sides: Efficient Multi-Agent Debate via Diversity-Aware Message Retention

Manh Nguyen; Anh Nguyen; Dung Nguyen; Svetha Venkatesh; Hung Le

Abstract

Multi-Agent Debate (MAD) has emerged as a promising framework for improving the reasoning quality of large language models through iterative inter-agent communication. However, broadcasting all agent messages at every round introduces noise and redundancy that can degrade debate quality and waste computational resources. Current approaches use uncertainty estimation to filter out low-confidence responses before broadcasting, but such filtering is unreliable because of miscalibrated confidence scores and sensitivity to threshold selection. To address this, we propose Diversity-Aware Retention (DAR), a lightweight debate framework that, at each debate round and before broadcasting, selects the subset of agent responses that maximally disagree with each other and with the majority vote. Through an explicit index-based retention mechanism, DAR preserves the original messages without modification, ensuring that retained disagreements remain authentic. Experiments on diverse reasoning and question answering benchmarks demonstrate that our selective message propagation consistently improves debate performance, particularly as the number of agents scales, where noise accumulation is most severe. Our results highlight that what agents hear is as important as what agents say in multi-agent reasoning systems.
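The retention idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes disagreement is measured by exact mismatch between final answer strings, uses a simple greedy pass, and falls back to the first `k` indices when every agent agrees. The names `majority_vote`, `retain_diverse_indices`, and `build_broadcast` are hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def retain_diverse_indices(answers, k):
    """Pick up to k indices whose answers differ from the majority
    vote and from each other (assumption: exact-match disagreement)."""
    seen = {majority_vote(answers)}  # the majority view counts as already heard
    kept = []
    for i, ans in enumerate(answers):
        if len(kept) == k:
            break
        if ans not in seen:
            kept.append(i)
            seen.add(ans)
    # Hypothetical fallback: with no disagreement, keep the first k indices.
    return kept if kept else list(range(min(k, len(answers))))


def build_broadcast(responses, answers, k):
    """Return the majority answer plus the retained messages.
    Messages are looked up by index, so their text is never rewritten."""
    kept = retain_diverse_indices(answers, k)
    return majority_vote(answers), [responses[i] for i in kept]


answers = ["12", "12", "15", "12", "9"]
responses = [f"Agent {i}: reasoning ... answer {a}" for i, a in enumerate(answers)]
vote, retained = build_broadcast(responses, answers, k=2)
print(vote)
print(retained)
```

Because selection returns indices rather than rewritten text, the messages passed to the next round are exactly the ones the agents produced, which mirrors the index-based retention the abstract describes.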

Paper Structure (33 sections, 5 equations, 14 figures, 9 tables, 1 algorithm)

Introduction
Related Work
Multi-Agent Debate
Uncertainty and Diversity in Multi-Agent LLM Systems
Method
Preliminaries
Incorporating Uncertainty in Multi-Agent Debate
Majority Vote as Additional Context
Improving MAD by Promoting Diversity
Retaining Criterion
Fallback behavior
Illustrative Case Study
Experiment
Experiment Setup
Models
...and 18 more sections

Figures (14)

Figure 1: Visualization of our diversity-enhanced MAD pipeline (DAR). In standard MAD, each agent receives all peer responses (e.g., A, A) as context, which can introduce redundancy or noisy prompts. Our method introduces the filter module $\mathcal{F}$ to maintain informative diversity across debate rounds, preserving varied reasoning paths and increasing the probability of generating correct answers.
Figure 2: Average performance of all six methods across models, aggregated over seven benchmarks, for different numbers of agents $N$. Each subplot corresponds to a single model, showing how accuracy changes with increasing $N$.
Figure 3: Diversity of retained responses across retention strategies on Qwen2.5-3B. A similar result for Qwen2.5-1.5B is provided in Appendix \ref{app:diversity}.
Figure 4: DAR recovers minority-correct answers while standard MAD fails. Example from Arithmetics using Qwen2.5-1.5B; full responses in Appendix \ref{app:qualitative}.
Figure 5: Average results on Arithmetics and Form.Log. over debate rounds $R$. $R{=}0$ indicates Majority Vote.
...and 9 more figures
