Application and Optimization of Large Models Based on Prompt Tuning for Fact-Check-Worthiness Estimation

Yinglong Yu, Hao Shen, Zhengyi Lyu, Qi He

TL;DR

This work targets the problem of fact-check-worthiness estimation under misinformation by proposing a prompt-tuning framework that converts the task into an in-context learning problem governed by two prompt templates. By integrating prompt templates, a simple verbalizer, in-context demonstrations, and parameter-efficient prompt tuning, the approach leverages latent knowledge in large language models with a frozen base model. Empirical results on an English COVID-19 tweet dataset show the method achieves competitive F1 and accuracy, surpassing classical baselines and GPT-3.5, and approaching GPT-4–level performance with a substantially smaller parameter count. The study demonstrates the practicality of prompt-tuning for fact-check-worthiness tasks and points to future work on expanding template design, exploring additional tuning methods, and extending evaluation to multilingual data.

Abstract

In response to the growing problem of misinformation in an era of globalization and informatization, this paper proposes a classification method for fact-check-worthiness estimation based on prompt tuning. We apply designed prompt templates to large language models to establish in-context learning, and use prompt tuning to improve the accuracy of determining whether a claim is worth fact-checking, particularly when data is limited or unlabeled. Experiments on public datasets show that the proposed method matches or surpasses multiple baselines on the fact-check-worthiness classification task, including classical pre-trained models such as BERT and recent large models such as GPT-3.5 and GPT-4. The method shows advantages in F1 score and accuracy, which supports its effectiveness for fact-check-worthiness estimation.
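To make the terms in the TL;DR and abstract concrete, the following is a minimal sketch of how a prompt template, in-context demonstrations, and a simple verbalizer might fit together. The template wording, the label words `yes`/`no`, and the helper names `build_prompt` and `verbalize` are assumptions made for illustration; the paper's actual templates and verbalizer are not given in this section, and the prompt-tuning step itself (training soft prompt parameters against a frozen base model) is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical template; the paper's two actual templates are not reproduced here.
TEMPLATE = 'Tweet: "{text}"\nIs this tweet worth fact-checking? Answer: {answer}'

# Hypothetical verbalizer: label word -> class id (1 = check-worthy, 0 = not).
VERBALIZER: Dict[str, int] = {"yes": 1, "no": 0}


def build_prompt(demos: List[Tuple[str, str]], query: str) -> str:
    """Concatenate labeled demonstrations, then the query with an empty answer slot."""
    blocks = [TEMPLATE.format(text=text, answer=answer) for text, answer in demos]
    # The query reuses the same template but leaves the answer for the model to fill.
    blocks.append(TEMPLATE.format(text=query, answer="").rstrip())
    return "\n\n".join(blocks)


def verbalize(label_word_scores: Dict[str, float]) -> int:
    """Map model scores over candidate words to a class via the verbalizer."""
    known = {w: s for w, s in label_word_scores.items() if w in VERBALIZER}
    if not known:
        raise ValueError("no verbalizer label word among the scores")
    best_word = max(known, key=known.get)
    return VERBALIZER[best_word]
```

In use, `build_prompt` would produce the text sent to the language model, and `verbalize` would turn the model's scores for the answer slot into a binary label; words outside the verbalizer are ignored.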

Paper Structure

This paper contains 23 sections, 3 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: Workflow of in-context learning method.
  • Figure 2: Workflow of prompt engineering.
  • Figure 3: Comparison of model parameter sizes and template lengths.
  • Figure 4: Comparison of in-context learning results.
  • Figure 5: Comparison of prompt tuning.
  • ...and 1 more figure