Take It Easy: Label-Adaptive Self-Rationalization for Fact Verification and Explanation Generation

Jing Yang, Anderson Rocha

TL;DR

A label-adaptive learning approach that improves veracity prediction on both the PubHealth and AVeriTec datasets. When few-shot fine-tuned on synthetic explanations, it performs comparably to the fully fine-tuned self-rationalization model, demonstrating the potential of low-budget learning with synthetic data.

Abstract

Computational methods to aid journalists in fact-checking often require adapting a model to specific domains and generating explanations. However, most automated fact-checking methods rely on three-class datasets, which do not accurately reflect real-world misinformation. Moreover, fact-checking explanations are often generated by summarizing the evidence, which fails to address the relationship between the claim and the evidence. To address these issues, we extend the self-rationalization method, typically used in natural language inference (NLI) tasks, to fact verification. We propose a label-adaptive learning approach: first, we fine-tune a model to learn veracity prediction with annotated labels (step-1 model). Then, we fine-tune the step-1 model again to learn self-rationalization, using the same data and additional annotated explanations. Our results show that our label-adaptive approach improves veracity prediction by more than ten percentage points (Macro F1) on both the PubHealth and AVeriTec datasets, outperforming the GPT-4 model. Furthermore, to address the high cost of explanation annotation, we generated 64 synthetic explanations from three large language models (GPT-4-turbo, GPT-3.5-turbo, and Llama-3-8B) and used them to few-shot fine-tune our step-1 model. The few-shot synthetic explanation fine-tuned model performed comparably to the fully fine-tuned self-rationalization model, demonstrating the potential of low-budget learning with synthetic data. Our label-adaptive self-rationalization approach presents a promising direction for future research on real-world explainable fact-checking with different labeling schemes.
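
The abstract reports gains in Macro F1, which averages per-class F1 with equal weight so that rare veracity classes count as much as frequent ones. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the choice to average over the union of gold and predicted labels are assumptions made for illustration.

```python
from typing import Sequence


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 (illustrative helper, not from the paper)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Assumption: average over every label seen in either sequence.
    classes = sorted(set(gold) | set(pred))
    scores = []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

Because every class contributes equally, a model that ignores a minority label is penalized even when its overall accuracy is high, which is why the metric suits datasets with more than three, unevenly distributed veracity labels.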

Paper Structure (18 sections, 2 figures, 7 tables)

Figures (2)

  • Figure 1: Models' performance on the AVeriTec dataset for each class (F1 score). 0-shot: zero-shot performance of T5-3B; Self-rationalization: T5-3B fine-tuned jointly on labels and explanations; Ours: label-adaptive self-rationalization.
  • Figure 2: Label-adaptive self-rationalization 2-step pipeline. In step-1, the model learns veracity prediction with only the provided labels; in step-2, the model learns the self-rationalization task with both labels and explanations.
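
The two-step pipeline in Figure 2 can be sketched as two text-to-text training formats that share the same input but differ in the target. This is an illustrative sketch only: the prompt template, the `explanation:` separator, and all names (`Example`, `step1_pair`, `step2_pair`, `parse_step2_output`) are assumptions, not the format used in the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Example:
    claim: str
    evidence: str
    label: str                         # dataset-specific veracity label
    explanation: Optional[str] = None  # annotated or synthetic explanation


def build_source(ex: Example) -> str:
    # Hypothetical input template; the same input is used in both steps.
    return f"claim: {ex.claim} evidence: {ex.evidence}"


def step1_pair(ex: Example) -> Tuple[str, str]:
    """Step-1: the target is the veracity label only."""
    return build_source(ex), ex.label


def step2_pair(ex: Example) -> Tuple[str, str]:
    """Step-2: the target is the label followed by the explanation."""
    if ex.explanation is None:
        raise ValueError("step-2 training needs an explanation")
    return build_source(ex), f"{ex.label} explanation: {ex.explanation}"


def parse_step2_output(text: str) -> Tuple[str, str]:
    """Split a generated step-2 string back into (label, explanation)."""
    label, _, explanation = text.partition(" explanation: ")
    return label.strip(), explanation.strip()
```

Under these assumptions, a sequence-to-sequence model would first be fine-tuned on `step1_pair` outputs for the whole training set, and the resulting checkpoint would then be fine-tuned on `step2_pair` outputs, either for all annotated explanations or for a small few-shot set of synthetic ones.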