Fact-checking information from large language models can decrease headline discernment
Matthew R. DeVerna, Harry Yaojun Yan, Kai-Cheng Yang, Filippo Menczer
TL;DR
This paper investigates how fact-checking information from a large language model (ChatGPT-3.5) influences belief in, and intent to share, political headlines. In a preregistered randomized experiment with belief and sharing arms and four conditions (including human fact checks), the AI-generated fact checks correctly identified most false headlines (90%) but did not significantly improve participants' ability to discern headline accuracy or to share accurate news. They were even harmful in specific cases: belief in true headlines decreased when the AI mislabeled them as false, and belief in false headlines increased when the AI was unsure about them. In contrast, human-generated fact checks significantly improved both belief and sharing discernment. The findings highlight the potential harms and limitations of deploying AI-only fact checking at scale and the need for careful design and policy to mitigate unintended effects on information quality in digital ecosystems.
Abstract
Fact checking can be an effective strategy against misinformation, but its implementation at scale is impeded by the overwhelming volume of information online. Recent artificial intelligence (AI) language models have shown impressive ability in fact-checking tasks, but how humans interact with fact-checking information provided by these models is unclear. Here, we investigate the impact of fact-checking information generated by a popular large language model (LLM) on belief in, and sharing intent of, political news headlines in a preregistered randomized control experiment. Although the LLM accurately identifies most false headlines (90%), we find that this information does not significantly improve participants' ability to discern headline accuracy or share accurate news. In contrast, viewing human-generated fact checks enhances discernment in both cases. Subsequent analysis reveals that the AI fact-checker is harmful in specific cases: it decreases beliefs in true headlines that it mislabels as false and increases beliefs in false headlines that it is unsure about. On the positive side, AI fact-checking information increases the sharing intent for correctly labeled true headlines. When participants are given the option to view LLM fact checks and choose to do so, they are significantly more likely to share both true and false news but only more likely to believe false headlines. Our findings highlight an important source of potential harm stemming from AI applications and underscore the critical need for policies to prevent or mitigate such unintended consequences.
