Explainable Automated Fact-Checking for Public Health Claims

Neema Kotonya, Francesca Toni

TL;DR

This paper addresses the need for explainable automated fact-checking in domains requiring specialized expertise, focusing on public health. It introduces PubHealth, a dataset of about 11.8K health-related claims with journalist-provided explanations, and develops a two-task framework for veracity prediction and explanation generation that relies on domain-adapted language models. By training on in-domain data, the approach improves both veracity accuracy and explanation quality, demonstrated through a joint extractive-abstractive explainer and ROUGE-based evaluation, supplemented by coherence-based human and NLI assessments. The work provides a framework for evaluating explanation quality via global and local coherence properties and suggests its approach can be extended to other expert domains to enhance trust and usability in automated fact-checking.

Abstract

Fact-checking is the task of verifying the veracity of claims by assessing their assertions against credible evidence. The vast majority of fact-checking studies focus exclusively on political claims. Very little research explores fact-checking for other topics, specifically subject matters for which expertise is required. We present the first study of explainable fact-checking for claims which require specific expertise. For our case study we choose the setting of public health. To support this case study we construct a new dataset, PUBHEALTH, of 11.8K claims accompanied by journalist-crafted, gold-standard explanations (i.e., judgments) to support the fact-check labels for claims. We explore two tasks: veracity prediction and explanation generation. We also define and evaluate, with humans and computationally, three coherence properties of explanation quality. Our results indicate that, by training on in-domain data, gains can be made in explainable, automated fact-checking for claims which require specific expertise.

Paper Structure

This paper contains 29 sections, 3 equations, 7 figures, 9 tables.

Figures (7)

  • Figure 1: Architecture of veracity prediction.
  • Figure 2: Example of model-generated explanations as compared to the gold standard from our fact-checking dataset.
  • Figure 3: Example of explanation which satisfies all three coherence properties.
  • Figure 4: Vocabulary from the health lexicon which features more than 300 times in PubHealth article texts.
  • Figure 5: Histograms showing the distribution of lengths, measured by the number of tokens, for claims and explanations in the PubHealth dataset.
  • ...and 2 more figures