Mining the Explainability and Generalization: Fact Verification Based on Self-Instruction

Guangyao Lu, Yulin Liu

TL;DR

This work tackles explainable fact-checking with open-source LLMs by introducing a self-instruction based fine-tuning framework that combines 2D data augmentation and improved DPO fine-tuning. Using Llama2-7B, it jointly models claim veracity and explanations on FEVEROUS and HOVER, achieving competitive accuracy with a fraction of trainable parameters and fluent explanations. The approach leverages counterfactual data and a difficulty-based sampling strategy to enhance generalization across challenging, multi-hop tasks, while addressing data privacy concerns. It demonstrates the practicality of privacy-preserving fact-checking in real-world settings, though it acknowledges limitations in suppressing hallucinations and multi-hop explanation quality, suggesting future distillation and open-world extensions.

Abstract

Fact-checking based on commercial LLMs has become mainstream. Although these methods offer high explainability, they fall short in accuracy compared to traditional fine-tuning approaches, and data security is also a significant concern. In this paper, we propose a self-instruction-based fine-tuning approach for fact-checking that balances accuracy and explainability. Our method consists of Data Augmentation and Improved DPO fine-tuning. The former starts by instructing the model to generate both positive and negative explanations based on claim-evidence pairs and labels, then samples the dataset according to our customized difficulty standards. The latter employs our proposed improved DPO to fine-tune the model using the generated samples. We fine-tune the smallest-scale LLaMA-7B model and evaluate it on the challenging fact-checking datasets FEVEROUS and HOVER, using four fine-tuning methods and three few-shot learning methods for comparison. The experiments demonstrate that our approach not only achieves accuracy comparable to, or even surpassing, traditional fine-tuning methods, but also generates fluent explanation text. Moreover, it exhibits high generalization performance. Our method is the first to leverage self-supervised learning for fact-checking, and it combines contrastive learning and improved DPO in fine-tuning LLMs, as shown in the experiments.
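
For readers unfamiliar with DPO, the following is a minimal sketch of the standard (unmodified) DPO loss for a single preference pair, where the "chosen" and "rejected" responses would correspond to the positive and negative explanations generated by self-instruction. It is not the authors' improved DPO, whose details are not given in this summary. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the trainable policy and
    a frozen reference model. All names here are illustrative assumptions,
    not taken from the paper.
    """
    # Implicit rewards: how much more likely the policy makes each
    # response compared with the reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # loss = -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is zero and the loss
# is log(2); raising the chosen log-probability lowers the loss.
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)
improved = dpo_loss(-5.0, -12.0, -10.0, -10.0)
```

In a real fine-tuning run the log-probabilities would come from batched model outputs and the loss would be averaged over pairs; scalars are used here only to keep the sketch self-contained.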

Paper Structure (27 sections, 10 equations, 5 figures, 6 tables)

This paper contains 27 sections, 10 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Our approach consists of two parts: data augmentation and DPO-based fine-tuning. The former expands the data along both the data dimension and the label dimension. The latter fine-tunes the model using PEFT methods, such as LoRA, together with the Improved DPO Algorithm.
  • Figure 2: The variation of $\mu$ on FEVEROUS and HOVER during training. The vertical axis represents the scalar value of $\mu$, while the horizontal axis represents the training steps.
  • Figure 3: The length of the generated explanations and the count of reasonable explanations. The horizontal axis represents the data type, and the vertical axis represents either the text length or the count of reasonable explanations, depending on the subfigure.
  • Figure 4: Two examples demonstrating how to generate prompts based on function $I$, with the first one coming from the counterfactual subset. Black font is used to combine information into coherent questions. Gray font represents $c$, $k$, and $tips$. Red font represents the label. Blue font indicates the explanation information generated by the model based on the question and label.
  • Figure 6: The variation of text generation probabilities during the training process on the Anthropic, FEVEROUS, and HOVER validation datasets. The horizontal axis represents the training step, while the vertical axis represents the logarithm of the generation probability (logp/s). The terms "Chosen" and "Rejected" correspond to the positive and negative examples of self-instruction, respectively. We report the results of three selected settings (-lag, -adv, basic) from the ablation experiments.