Automated Fact-Checking of Climate Change Claims with Large Language Models

Markus Leippold; Saeid Ashraf Vaghefi; Dominik Stammbach; Veruska Muccione; Julia Bingler; Jingwei Ni; Chiara Colesanti-Senni; Tobias Wekhof; Tobias Schimanski; Glen Gostlow; Tingyu Yu; Juerg Luterbacher; Christian Huggel

TL;DR

Climinator tackles the urgent need for automated, evidence-based fact-checking of climate claims by deploying a Mediator-Advocate framework that aggregates multiple LLMs anchored to authoritative sources such as the IPCC and WMO, with optional adversarial perspectives. Claims are decomposed and evaluated by specialized advocates, and a Mediator synthesizes these inputs into a final verdict, iterating through follow-up questions when disagreements arise. Empirical results across Climate Feedback, Skeptical Science, and NIPCC-derived claims show that Climinator and its enhanced variant outperform single-model baselines, particularly in multi-class and nuanced classifications, and display robustness to adversarial prompts. The study also analyzes the handling of "Not Enough Information" (NEI) verdicts, the effects of including denialist advocates, and the limits of recency and source diversity. It outlines future directions toward open-source infrastructure, multi-modal inputs, and rigorous output evaluation for broader applications in climate policy and governance.
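The Mediator-Advocate loop described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the advocates are hard-coded stubs, not LLM calls, claim decomposition is omitted, and every name here (`Assessment`, `make_stub_advocate`, `mediate`, `max_rounds`), the follow-up question text, and the majority-vote fallback are assumptions made purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Assessment:
    advocate: str   # which advocate produced this assessment
    verdict: str    # e.g. "correct", "incorrect", or "NEI"
    rationale: str  # free-text justification (stubbed here)


# Hypothetical interface: an advocate maps (claim, follow_up) to an Assessment.
Advocate = Callable[[str, str], Assessment]


def make_stub_advocate(name: str, first: str, after_follow_up: str) -> Advocate:
    """Build a stand-in advocate with fixed verdicts (no LLM involved)."""
    def advocate(claim: str, follow_up: str) -> Assessment:
        # The stub answers `first` initially and `after_follow_up` once
        # the mediator has asked a follow-up question.
        verdict = after_follow_up if follow_up else first
        return Assessment(name, verdict, f"{name} assessment of {claim!r}")
    return advocate


def mediate(claim: str, advocates: List[Advocate], max_rounds: int = 3) -> str:
    """Query all advocates; on disagreement, ask a follow-up and retry."""
    follow_up = ""
    assessments: List[Assessment] = []
    for _ in range(max_rounds):
        assessments = [adv(claim, follow_up) for adv in advocates]
        verdicts = {a.verdict for a in assessments}
        if len(verdicts) == 1:
            return verdicts.pop()  # all advocates agree
        # Assumed follow-up wording; a real mediator would generate this.
        follow_up = "Which cited evidence supports your verdict?"
    # Assumed fallback when no agreement is reached: simple majority vote.
    return Counter(a.verdict for a in assessments).most_common(1)[0][0]


advocates = [
    make_stub_advocate("ipcc", "incorrect", "incorrect"),
    make_stub_advocate("wmo", "incorrect", "incorrect"),
    make_stub_advocate("contrarian", "correct", "incorrect"),
]
print(mediate("Example climate claim", advocates))
```

In this toy run the contrarian stub changes its verdict after one follow-up round, which mirrors, in a very simplified way, the convergence behavior the summary describes. Real advocates would be retrieval-grounded LLMs whose answers are not predetermined.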

Abstract

This paper presents Climinator, a novel AI-based tool designed to automate the fact-checking of climate change claims. Utilizing an array of Large Language Models (LLMs) informed by authoritative sources like the IPCC reports and peer-reviewed scientific literature, Climinator employs an innovative Mediator-Advocate framework. This design allows Climinator to effectively synthesize varying scientific perspectives, leading to robust, evidence-based evaluations. Our model demonstrates remarkable accuracy when testing claims collected from Climate Feedback and Skeptical Science. Notably, when integrating an advocate with a climate science denial perspective in our framework, Climinator's iterative debate process reliably converges towards scientific consensus, underscoring its adeptness at reconciling diverse viewpoints into science-based, factual conclusions. While our research is subject to certain limitations and necessitates careful interpretation, our approach holds significant potential. We hope to stimulate further research and encourage exploring its applicability in other contexts, including political fact-checking and legal domains.

Paper Structure (44 sections, 7 figures, 5 tables)

This paper contains 44 sections, 7 figures, 5 tables.

Consolidating the verdicts.
Performance analysis.
Analyzing diverging verdicts.
Obtaining claim verdicts.
Explaining the fraction of NEIs.
Performance analysis.
Absence of follow-up questions.
Analyzing the impact of the NIPCC Advocate.
Analyzing NIPCC's executive summary.
Recency of information.
Source quality and comprehensiveness.
Technical infrastructure.
Multi-modal debating system.
Output Evaluation.
Data Availability
...and 29 more sections

Figures (7)

Figure 1: Climinator: An LLM-based framework within a Mediator-Advocate system to assess the veracity of climate-related claims.
Figure 2: Distribution of claim verdicts for Climate Feedback, annotated by climate experts. The red bars indicate the verdicts that end up as "incorrect", while the green bars end up as "correct".
Figure 3: Three levels of category consolidation, starting from the original twelve Climate Feedback verdicts.
Figure 4: The ratio of 'Not Enough Information' (NEI) verdicts for different models. A model generates an NEI if it cannot access the information needed to provide a verdict on a claim taken from Climate Feedback.
Figure 5: Distribution of claim verdicts for Skeptical Science. The red bars indicate the verdicts that end up as "incorrect", while the green bars end up as "correct" (see Figure 3).
...and 2 more figures
