AFaCTA: Assisting the Annotation of Factual Claim Detection with Reliable LLM Annotators
Jingwei Ni, Minjing Shi, Dominik Stammbach, Mrinmaya Sachan, Elliott Ash, Markus Leippold
TL;DR
This work tackles two key barriers in factual claim detection, inconsistent task definitions and costly manual annotation, by introducing a verifiability-based definition of factual claims and AFaCTA, an LLM-assisted annotation framework. AFaCTA runs three prompting steps (Direct Classification, Fact-Extraction CoT, and Reasoning with Debate), aggregates their outputs by majority vote, and uses agreement across the steps as a measure of annotation reliability. Evaluated on PoliClaim, a corpus of U.S. political speeches spanning 25 years, AFaCTA with GPT-4 achieves near-expert accuracy on perfectly consistent samples and can auto-label a substantial portion of the data, enabling effective classifier training and data augmentation; the results generalize to a social-media domain (CheckThat!-2021-dev). The findings suggest that high-quality, self-consistent LLM annotations can substitute for manual labeling in scalable fact-checking work, with practical implications for building large, reliable claim-detection resources across domains.
Abstract
With the rise of generative AI, automated fact-checking methods to combat misinformation are becoming increasingly important. However, factual claim detection, the first step in a fact-checking pipeline, suffers from two key issues that limit its scalability and generalizability: (1) inconsistent definitions of the task and of what a claim is, and (2) the high cost of manual annotation. To address (1), we review the definitions in related work and propose a unifying definition of factual claims that focuses on verifiability. To address (2), we introduce AFaCTA (Automatic Factual Claim deTection Annotator), a novel framework that uses large language models (LLMs) to assist in the annotation of factual claims. AFaCTA calibrates its annotation confidence by measuring consistency along three predefined reasoning paths. Extensive evaluation and experiments in the domain of political speech show that AFaCTA can efficiently assist experts in annotating factual claims and training high-quality classifiers, and can work with or without expert supervision. Our analyses also yield PoliClaim, a comprehensive claim detection dataset spanning diverse political topics.
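The aggregation idea summarized above (three reasoning paths, a majority vote, and consistency as a confidence signal) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each reasoning path has already been reduced to one boolean vote, and the names `Annotation` and `aggregate` are hypothetical. The LLM prompting itself is out of scope here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Annotation:
    label: bool       # majority-vote label: True means "factual claim"
    consistent: bool  # True only when every reasoning path cast the same vote


def aggregate(votes: list[bool]) -> Annotation:
    """Majority vote over an odd number of per-path votes (assumed setup)."""
    if len(votes) % 2 == 0:
        # An even (or empty) vote list could tie, so this sketch rejects it.
        raise ValueError("need an odd number of votes to avoid ties")
    counts = Counter(votes)
    label = counts[True] > counts[False]
    # A single distinct vote value means the paths were perfectly consistent.
    return Annotation(label=label, consistent=len(counts) == 1)


# Hypothetical votes from the three steps for two sentences.
unanimous = aggregate([True, True, True])
split = aggregate([True, False, True])
```

Under these assumptions, `unanimous` is a perfectly consistent sample that could be auto-labeled, while `split` still receives a majority label but would be a natural candidate for expert review.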
