Can LLMs Automate Fact-Checking Article Writing?

Dhruv Sahnan, David Corney, Irene Larraz, Giovanni Zagni, Ruben Miguez, Zhuohan Xie, Iryna Gurevych, Elizabeth Churchill, Tanmoy Chakraborty, Preslav Nakov

TL;DR

This work addresses a gap in automatic fact-checking: existing systems rarely produce output suitable for the general public. It proposes the automatic generation of full fact-checking articles and introduces QRAFT, a three-agent, two-stage framework that mimics how human fact-checkers write: a Planner gathers evidence, a Writer composes the draft, and an Editor refines it through simulated editorial feedback, with iterative planning and revision. Across automatic metrics and expert judgments, QRAFT outperforms several baselines yet remains below expert-written articles, highlighting ongoing trust and quality gaps for AI-generated journalism. The results motivate further research on integrating domain-specific guidelines, enhancing factual coherence, and ensuring transparent, verifiable citations before AI-generated articles can be used in public communication.

Abstract

Automatic fact-checking aims to support professional fact-checkers by offering tools that can help speed up manual fact-checking. Yet, existing frameworks fail to address the key step of producing output suitable for broader dissemination to the general public: while human fact-checkers communicate their findings through fact-checking articles, automated systems typically produce little or no justification for their assessments. Here, we aim to bridge this gap. We argue for the need to extend the typical automatic fact-checking pipeline with automatic generation of full fact-checking articles. We first identify key desiderata for such articles through a series of interviews with experts from leading fact-checking organizations. We then develop QRAFT, an LLM-based agentic framework that mimics the writing workflow of human fact-checkers. Finally, we assess the practical usefulness of QRAFT through human evaluations with professional fact-checkers. Our evaluation shows that while QRAFT outperforms several previously proposed text-generation approaches, it lags considerably behind expert-written articles. We hope that our work will enable further research in this new and important direction.

Paper Structure

This paper contains 40 sections, 1 equation, 4 figures, 9 tables, 1 algorithm.

Figures (4)

  • Figure 1: Our proposed pipeline for automatic fact-checking. We extend the typical steps in the automatic claim fact-checking pipeline to include a new task: fact-checking article generation.
  • Figure 2: Workflow of QRAFT: stage (a) -- planning and compiling the first draft. We use two agents, Planner and Writer, and we split this stage into three steps: (i) Gathering Evidence Nuggets, (ii) Setting the Preferences, and (iii) Draft Planning and Iterative Rewriting. See the methodology section for more detail.
  • Figure 3: Workflow of QRAFT: stage (b) -- simulating an editorial review. We simulate conversational question-answering between the Editor and the Writer in order to generate feedback on how to improve the first draft from QRAFT's stage (a).
  • Figure 4: Distribution of rankings received by the four fact-check articles based on relative publishability as perceived by the experts.
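
The two-stage workflow described in the captions of Figures 2 and 3 can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the `Agent` type, the `Draft` class, the functions `stage_a` and `stage_b`, the prompt strings, and the `n_rewrites` and `n_questions` parameters are all hypothetical, and each agent is assumed to be a plain function from a prompt string to a reply string (in practice, a wrapper around an LLM call).

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption: an agent is any callable mapping a prompt to a text reply.
Agent = Callable[[str], str]


@dataclass
class Draft:
    """Current article text plus every earlier version (hypothetical container)."""
    text: str
    history: List[str] = field(default_factory=list)


def stage_a(claim: str, evidence_docs: List[str],
            planner: Agent, writer: Agent, n_rewrites: int = 2) -> Draft:
    """Stage (a): planning and compiling the first draft."""
    # (i) Gathering evidence nuggets: one planner call per evidence document.
    nuggets = [planner(f"Extract a nugget from: {doc}") for doc in evidence_docs]
    # (ii) Setting the preferences for the article.
    preferences = planner(f"Set preferences for the claim: {claim}")
    # (iii) Draft planning and iterative rewriting.
    plan = planner(f"Plan an article. Claim: {claim}. "
                   f"Preferences: {preferences}. Nuggets: {nuggets}")
    draft = Draft(text=writer(f"Write a draft from this plan: {plan}"))
    for _ in range(n_rewrites):
        draft.history.append(draft.text)  # keep the version being replaced
        draft.text = writer(f"Rewrite this draft: {draft.text}")
    return draft


def stage_b(draft: Draft, editor: Agent, writer: Agent,
            n_questions: int = 2) -> Draft:
    """Stage (b): simulated editorial review via Editor-Writer question answering."""
    feedback = []
    for i in range(n_questions):
        question = editor(f"Ask question {i + 1} about this draft: {draft.text}")
        answer = writer(f"Answer the editor's question: {question}")
        feedback.append(f"Q: {question} A: {answer}")
    draft.history.append(draft.text)
    # One final revision that takes the collected feedback into account.
    draft.text = writer(f"Revise using feedback {feedback}. Draft: {draft.text}")
    return draft
```

Passing the agents in as plain callables keeps the control flow separate from any particular model or API, so stub functions can stand in for the Planner, Writer, and Editor when checking the orchestration logic.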