Automated Fact-Checking of Climate Change Claims with Large Language Models
Markus Leippold, Saeid Ashraf Vaghefi, Dominik Stammbach, Veruska Muccione, Julia Bingler, Jingwei Ni, Chiara Colesanti-Senni, Tobias Wekhof, Tobias Schimanski, Glen Gostlow, Tingyu Yu, Juerg Luterbacher, Christian Huggel
TL;DR
Climinator tackles the urgent need for automated, evidence-based fact-checking of climate claims by deploying a Mediator-Advocate framework that aggregates multiple LLMs anchored to authoritative sources such as the IPCC and WMO, with optional adversarial perspectives. Claims are decomposed and evaluated by specialized advocates, and a Mediator synthesizes these inputs into a final verdict, iterating through follow-up questions when disagreements arise. Empirical results across Climate Feedback, Skeptical Science, and NIPCC-derived claims show that Climinator and its enhanced variant outperform single-model baselines, particularly in multi-class and nuanced classifications, and display robustness to adversarial prompts. The study also analyzes the handling of claims labeled Not Enough Information (NEI), the effects of including denialist advocates, and the limits of recency and source diversity, outlining future directions toward open-source infrastructure, multi-modal inputs, and rigorous output evaluation for broader applications in climate policy and governance.
Abstract
This paper presents Climinator, a novel AI-based tool designed to automate the fact-checking of climate change claims. Utilizing an array of Large Language Models (LLMs) informed by authoritative sources like the IPCC reports and peer-reviewed scientific literature, Climinator employs an innovative Mediator-Advocate framework. This design allows Climinator to effectively synthesize varying scientific perspectives, leading to robust, evidence-based evaluations. Our model demonstrates remarkable accuracy when tested on claims collected from Climate Feedback and Skeptical Science. Notably, when we integrate an advocate with a climate science denial perspective into our framework, Climinator's iterative debate process reliably converges towards scientific consensus, underscoring its adeptness at reconciling diverse viewpoints into science-based, factual conclusions. While our research is subject to certain limitations and necessitates careful interpretation, our approach holds significant potential. We hope to stimulate further research and encourage exploration of its applicability in other contexts, including political fact-checking and legal domains.
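To make the Mediator-Advocate loop described above more concrete, the following is a minimal sketch in Python, not the authors' implementation. Everything in it is an assumption made for illustration: the names `Opinion`, `mediate`, and the three stub advocates are hypothetical, the verdict labels are placeholders, the advocates are plain functions rather than LLMs grounded in IPCC or WMO text, and the majority-vote fallback is a simple stand-in for however the real Mediator resolves persistent disagreement.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Opinion:
    """One advocate's assessment of a claim (hypothetical structure)."""
    advocate: str
    verdict: str      # placeholder labels such as "correct" / "incorrect"
    rationale: str


# An advocate maps (claim, follow-up question) to an Opinion.
# In the paper these are LLMs tied to specific sources; here they are stubs.
Advocate = Callable[[str, str], Opinion]


def mediate(claim: str, advocates: List[Advocate], max_rounds: int = 3) -> str:
    """Query all advocates, re-asking with a follow-up question on disagreement.

    Returns the shared verdict once all advocates agree. If they still
    disagree after max_rounds, falls back to a majority vote (an assumption
    of this sketch, not something stated in the source).
    """
    if not advocates or max_rounds < 1:
        raise ValueError("need at least one advocate and one round")

    question = ""  # empty on the first round: no follow-up yet
    opinions: List[Opinion] = []
    for _ in range(max_rounds):
        opinions = [advocate(claim, question) for advocate in advocates]
        verdicts = {op.verdict for op in opinions}
        if len(verdicts) == 1:
            return verdicts.pop()
        # Disagreement: the mediator formulates a follow-up question.
        question = (
            "Advocates disagree (" + ", ".join(sorted(verdicts))
            + "). Cite the evidence supporting your verdict."
        )

    counts = Counter(op.verdict for op in opinions)
    return counts.most_common(1)[0][0]


# --- Toy advocates (hard-coded behaviour, for illustration only) ---

def ipcc_advocate(claim: str, question: str) -> Opinion:
    return Opinion("ipcc", "incorrect", "Stub: contradicted by assessed evidence.")


def wmo_advocate(claim: str, question: str) -> Opinion:
    return Opinion("wmo", "incorrect", "Stub: contradicted by observations.")


def contrarian_advocate(claim: str, question: str) -> Opinion:
    # Supports the claim at first, then concedes once asked for evidence.
    if question:
        return Opinion("contrarian", "incorrect", "Stub: no supporting evidence.")
    return Opinion("contrarian", "correct", "Stub: initial contrarian position.")


def stubborn_advocate(claim: str, question: str) -> Opinion:
    # Never changes its position, which forces the majority-vote fallback.
    return Opinion("stubborn", "correct", "Stub: position never changes.")


if __name__ == "__main__":
    claim = "Global temperatures have not risen since 1998."
    print(mediate(claim, [ipcc_advocate, wmo_advocate, contrarian_advocate]))
```

In this toy setup the contrarian advocate changes its verdict once the follow-up question arrives, so the loop ends in agreement on the second round. Replacing it with `stubborn_advocate` exercises the fallback path instead.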
