Beyond Detection: Exploring Evidence-based Multi-Agent Debate for Misinformation Intervention and Persuasion

Chen Han, Yijia Ma, Jin Tan, Wenzhen Zheng, Xijin Tang

TL;DR

This work tackles misinformation by introducing ED2D, an evidence-based multi-agent debate (MAD) framework that integrates external factual retrieval into adversarial reasoning and uses persuasive debunking to influence user beliefs. ED2D combines a five-stage debate with a Wikipedia-based evidence pipeline to ground arguments and produce transparent, debatable explanations, achieving strong detection performance and interpretability across three benchmarks, including the real-world Snopes25 dataset. A controlled human-subject study demonstrates that ED2D can be as persuasive as expert fact-checks when correct, but also reveals risks where incorrect AI outputs can mislead users, underscoring the need for safeguards. A public platform accompanies ED2D to promote transparency, epistemic vigilance, and collaborative fact-checking, highlighting both practical utility and safety considerations for deploying persuasive AI in misinformation intervention.

Abstract

Multi-agent debate (MAD) frameworks have emerged as promising approaches for misinformation detection by simulating adversarial reasoning. While prior work has focused on detection accuracy, it overlooks the importance of helping users understand the reasoning behind factual judgments and develop future resilience. The debate transcripts generated during MAD offer a rich but underutilized resource for transparent reasoning. In this study, we introduce ED2D, an evidence-based MAD framework that extends previous approaches by incorporating factual evidence retrieval. More importantly, ED2D is designed not only as a detection framework but also as a persuasive multi-agent system aimed at correcting user beliefs and discouraging misinformation sharing. We compare the persuasive effects of ED2D-generated debunking transcripts with those authored by human experts. Results demonstrate that ED2D outperforms existing baselines across three misinformation detection benchmarks. When ED2D generates correct predictions, its debunking transcripts exhibit persuasive effects comparable to those of human experts; however, when ED2D misclassifies, its accompanying explanations may inadvertently reinforce users' misconceptions, even when presented alongside accurate human explanations. Our findings highlight both the promise and the potential risks of deploying MAD systems for misinformation intervention. We further develop a public community website to help users explore ED2D, fostering transparency, critical thinking, and collaborative fact-checking.

Paper Structure

This paper contains 19 sections, 4 figures, 4 tables.

Figures (4)

  • Figure 1: A debate example on the claim that toilet flushing releases airborne pathogens. The case demonstrates the ED2D reasoning process, with the affirmative side prevailing based on evidence-grounded argumentation.
  • Figure 2: Architecture of the ED2D framework. Given a news claim, LLM agents with domain-specific profiles engage in a structured debate comprising five stages: Opening, Rebuttal, Free Debate, Closing, and Judgment. During the Free Debate and Judgment stages, an evidence retrieval module retrieves relevant factual information from external sources to support or challenge arguments. All agents share a compressed history memory, enabling coherent multi-turn interactions.
  • Figure 3: A demonstration of the ED2D community website.
  • Figure 4: Accuracy comparison by topical domain and claim timeliness. Subfigures (a) and (b) show domain-level accuracy under RQ2 and RQ3 across four domains: Science&Environment, Politics&Government, Medicine&Culture, and Society&Entertainment. Subfigures (c) and (d) analyze accuracy by timeliness, contrasting Current versus Non-Current claims.
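
The five-stage flow described in the Figure 2 caption can be sketched roughly as below. This is only an illustrative outline under stated assumptions, not the authors' implementation: the names `run_debate`, `SharedMemory`, `stub_speak`, and `stub_retrieve` are hypothetical, the agents and retriever are placeholders rather than real LLM or Wikipedia calls, and "compression" is reduced to naive truncation. The only details taken from the caption are the stage order, the two stages that trigger evidence retrieval, and the shared history.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Stage order and retrieval stages as listed in the Figure 2 caption.
STAGES = ["Opening", "Rebuttal", "Free Debate", "Closing", "Judgment"]
EVIDENCE_STAGES = {"Free Debate", "Judgment"}


@dataclass
class SharedMemory:
    """History shared by all agents; truncation stands in for real compression (assumption)."""
    max_chars: int = 200
    entries: List[str] = field(default_factory=list)

    def add(self, speaker: str, stage: str, text: str) -> None:
        self.entries.append(f"[{stage}] {speaker}: {text[: self.max_chars]}")

    def summary(self) -> str:
        return "\n".join(self.entries)


# speak(role, stage, claim, history, evidence) -> text; retrieve(claim) -> snippets.
Speak = Callable[[str, str, str, str, List[str]], str]
Retrieve = Callable[[str], List[str]]


def run_debate(claim: str, speak: Speak, retrieve: Retrieve) -> Tuple[str, SharedMemory]:
    """Run the five stages in order and return the judge's verdict plus the transcript."""
    memory = SharedMemory()
    verdict = ""
    for stage in STAGES:
        # Retrieval is only invoked in the stages the caption names.
        evidence = retrieve(claim) if stage in EVIDENCE_STAGES else []
        if stage == "Judgment":
            verdict = speak("judge", stage, claim, memory.summary(), evidence)
            memory.add("judge", stage, verdict)
        else:
            for side in ("affirmative", "negative"):
                text = speak(side, stage, claim, memory.summary(), evidence)
                memory.add(side, stage, text)
    return verdict, memory


def stub_retrieve(claim: str) -> List[str]:
    # Placeholder: a real system would query an external source here.
    return [f"(placeholder evidence snippet about: {claim})"]


def stub_speak(role: str, stage: str, claim: str, history: str, evidence: List[str]) -> str:
    # Placeholder agent: a real system would prompt an LLM with the shared history.
    if role == "judge":
        return "affirmative"  # hard-coded verdict, for illustration only
    cited = f" citing {len(evidence)} snippet(s)" if evidence else ""
    return f"{role} {stage.lower()} statement{cited}"


verdict, memory = run_debate(
    "Toilet flushing releases airborne pathogens.", stub_speak, stub_retrieve
)
print(verdict)
print(memory.summary())
```

Passing the agents and the retriever in as plain callables keeps the stage logic separate from any particular model or search backend, so the same loop can be exercised with stubs, as here, before wiring in real components.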