MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims

Isabelle Augenstein, Christina Lioma, Dongsheng Wang, Lucas Chaves Lima, Casper Hansen, Christian Hansen, Jakob Grue Simonsen

TL;DR

MultiFC provides the largest real-world, multi-domain dataset for evidence-based fact checking, containing 34,918 naturally occurring claims across 26 domains with 10 retrieved evidence pages per claim and rich metadata. The authors analyze dataset characteristics and demonstrate that both evidence pages and metadata significantly improve veracity prediction, proposing a novel joint model that ranks evidence and predicts veracity, achieving a Macro F1 of 0.492. The approach uses multitask learning with a shared label embedding layer to handle disparate domain label spaces, and evidence-ranking mechanisms to weight evidence contributions. This dataset and methodology offer a challenging, realistic benchmark for advancing cross-domain fact checking and evidence integration in automated systems.

Abstract

We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim verification. It is collected from 26 fact checking websites in English, paired with textual sources and rich metadata, and labelled for veracity by human expert journalists. We present an in-depth analysis of the dataset, highlighting characteristics and challenges. Further, we present results for automatic veracity prediction, both with established baselines and with a novel method for joint ranking of evidence pages and predicting veracity that outperforms all baselines. Significant performance increases are achieved by encoding evidence, and by modelling metadata. Our best-performing model achieves a Macro F1 of 49.2%, showing that this is a challenging testbed for claim veracity prediction.
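Macro F1, the metric reported above, is the unweighted mean of the per-label F1 scores, so rare veracity labels count as much as frequent ones. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the dataset.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: a classifier that always predicts the majority label.
gold = ["true", "true", "true", "true", "false", "mixed"]
pred = ["true"] * len(gold)

accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
score = macro_f1(gold, pred)
print(f"accuracy={accuracy:.3f} macro_f1={score:.3f}")
```

In this toy case the majority-label predictor is right on four of six claims, yet its Macro F1 is far lower because two of the three labels receive an F1 of zero. This is why Macro F1 is a demanding metric when label distributions are skewed.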

Paper Structure

This paper contains 24 sections, 7 equations, 3 figures, and 11 tables.

Figures (3)

  • Figure 1: Distribution of entities in claims.
  • Figure 2: The Joint Veracity Prediction and Evidence Ranking model, shown for one task.
  • Figure 3: Confusion matrix of predicted labels for the best-performing model, crawled_ranked + meta, on the 'pomt' domain.