DELL: Generating Reactions and Explanations for LLM-Based Misinformation Detection
Herun Wan, Shangbin Feng, Zhaoxuan Tan, Heng Wang, Yulia Tsvetkov, Minnan Luo
TL;DR
DELL tackles misinformation detection by integrating LLMs at three stages: generating diverse user reactions to ground news articles, generating explanations for proxy tasks (e.g., sentiment, stance) to enrich article contexts, and merging task-specific experts under LLM guidance to produce calibrated predictions. Evaluated on seven datasets with three LLMs, DELL outperforms state-of-the-art baselines by up to 16.8% in macro F1-score and yields better-calibrated predictions. The analysis indicates that the generated reactions and explanations both help detection, and that merging experts using their predictions and confidence scores improves calibration. Overall, DELL shows that structured LLM integration (generated reactions, proxy-task explanations, and selective expert merging) can deliver robust misinformation detectors.
Abstract
Challenges in factuality and hallucination prevent large language models from being employed directly off-the-shelf to judge the veracity of news articles, where factual accuracy is paramount. In this work, we propose DELL, which identifies three key stages in misinformation detection where LLMs could be incorporated as part of the pipeline: 1) LLMs could generate news reactions to represent diverse perspectives and simulate user-news interaction networks; 2) LLMs could generate explanations for proxy tasks (e.g., sentiment, stance) to enrich the contexts of news articles and produce experts specializing in various aspects of news understanding; 3) LLMs could merge task-specific experts and provide an overall prediction by incorporating the predictions and confidence scores of the individual experts. Extensive experiments on seven datasets with three LLMs demonstrate that DELL outperforms state-of-the-art baselines by up to 16.8% in macro F1-score. Further analysis reveals that the generated reactions and explanations substantially help misinformation detection, while our proposed LLM-guided expert merging helps produce better-calibrated predictions.
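To make the third stage more concrete, the following is a minimal sketch of how expert predictions and confidence scores might be formatted into a prompt for an LLM to merge. The `ExpertOutput` structure, the `build_merge_prompt` function, the expert names, and the prompt wording are all hypothetical and chosen for illustration; they are not taken from the paper, and the LLM call itself is omitted.

```python
from dataclasses import dataclass


@dataclass
class ExpertOutput:
    """One task-specific expert's verdict on one article (hypothetical structure)."""
    name: str          # illustrative proxy-task name, e.g. "sentiment" or "stance"
    prediction: str    # "real" or "fake"
    confidence: float  # probability the expert assigns to its own prediction, in [0, 1]


def build_merge_prompt(article: str, experts: list[ExpertOutput]) -> str:
    """Format expert predictions and confidences into a single prompt string.

    The wording is an assumption made for illustration, not the paper's prompt.
    """
    lines = [f"News article: {article}", "", "Expert predictions:"]
    for i, e in enumerate(experts, start=1):
        lines.append(
            f"{i}. Expert '{e.name}' predicts '{e.prediction}' "
            f"with confidence {e.confidence:.2f}."
        )
    lines.append("")
    lines.append("Considering the experts above, is the article real or fake?")
    return "\n".join(lines)


# Made-up example values, used only to show the resulting prompt layout.
example_experts = [
    ExpertOutput("sentiment", "fake", 0.62),
    ExpertOutput("stance", "real", 0.91),
]
prompt = build_merge_prompt("Example headline and body text.", example_experts)
print(prompt)
```

Exposing each expert's confidence alongside its label is what would let the merging model weigh a confident expert over an uncertain one, rather than treating all votes equally.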
