Correcting misinformation on social media with a large language model
Xinyi Zhou, Ashish Sharma, Amy X. Zhang, Tim Althoff
TL;DR
To address misinformation on social media, the paper presents MUSE, a retrieval-augmented, multimodal system that identifies which parts of a piece of content are inaccurate and explains why, with grounded references. MUSE comprises three components: a response generator built on an LLM, a hierarchical credibility-aware web retriever, and a multimodal integrator that converts images into text descriptions for evidence retrieval. In expert evaluations across 464 posts, MUSE's overall response quality averaged 8.1/10, outperforming GPT-4 by 37% and high-helpfulness layperson responses by 29%. An end-user perception study (n=988) shows that MUSE corrections raise the correct belief that misinformation is misleading by 9.8%. The per-post cost was around $0.50 at the time, and the approach generalizes across modalities, domains, and political leanings. Limitations include no video input, an English-only assessment on X Community Notes, and reliance on credible-source retrieval.
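The three components described above can be pictured as a short pipeline. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the names `Post`, `Evidence`, `build_query`, `retrieve`, and `correct` are hypothetical, the captioner, search backend, and LLM are passed in as plain callables, and the "credibility-aware" step is approximated by a simple threshold followed by relevance ranking, which may differ from the hierarchical scheme the paper uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Post:
    text: str
    image_ref: Optional[str] = None  # hypothetical handle to an attached image


@dataclass
class Evidence:
    url: str
    snippet: str
    credibility: float  # hypothetical score in [0, 1]
    relevance: float    # hypothetical score in [0, 1]


def build_query(post: Post, captioner: Callable[[str], str]) -> str:
    """Multimodal integration: fold an image description into the text query."""
    if post.image_ref is None:
        return post.text
    return f"{post.text}\n[Image: {captioner(post.image_ref)}]"


def retrieve(
    query: str,
    search: Callable[[str], List[Evidence]],
    min_credibility: float = 0.7,  # assumed threshold, not from the paper
    top_k: int = 3,                # assumed cutoff, not from the paper
) -> List[Evidence]:
    """Drop low-credibility hits first, then keep the most relevant ones."""
    credible = [e for e in search(query) if e.credibility >= min_credibility]
    return sorted(credible, key=lambda e: e.relevance, reverse=True)[:top_k]


def correct(
    post: Post,
    captioner: Callable[[str], str],
    search: Callable[[str], List[Evidence]],
    llm: Callable[[str], str],
) -> str:
    """Response generation: prompt the LLM with the post and numbered references."""
    query = build_query(post, captioner)
    evidence = retrieve(query, search)
    refs = "\n".join(
        f"[{i}] {e.url}: {e.snippet}" for i, e in enumerate(evidence, 1)
    )
    prompt = (
        "Identify which parts of the post, if any, are inaccurate or "
        "misleading, and explain why, citing the numbered references.\n\n"
        f"Post:\n{query}\n\nReferences:\n{refs}"
    )
    return llm(prompt)
```

Passing the captioner, search function, and LLM as arguments keeps the sketch independent of any particular vision-language model, search API, or LLM provider; each can be replaced by a stub when experimenting.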
Abstract
Real-world information, often multimodal, can be misinformed or potentially misleading due to factual errors, outdated claims, missing context, misinterpretation, and more. Such "misinformation" is understudied, challenging to address, and harms many social domains -- particularly on social media, where it can spread rapidly. Manual correction that identifies and explains its (in)accuracies is widely accepted but difficult to scale. While large language models (LLMs) can generate human-like language that could accelerate misinformation correction, they struggle with outdated information, hallucinations, and limited multimodal capabilities. We propose MUSE, an LLM augmented with vision-language modeling and web retrieval over relevant, credible sources to generate responses that determine whether and which part(s) of the given content can be misinformed or potentially misleading, and to explain why with grounded references. We further define a comprehensive set of rubrics to measure response quality, ranging from the accuracy of identifications and factuality of explanations to the relevance and credibility of references. Results show that MUSE consistently produces high-quality outputs across diverse social media content (e.g., modalities, domains, political leanings), including content that has not previously been fact-checked online. Overall, MUSE outperforms GPT-4 by 37% and even high-quality responses from social media users by 29%. Our work provides a general methodological and evaluative framework for correcting misinformation at scale.
