Towards LLM-based Fact Verification on News Claims with a Hierarchical Step-by-Step Prompting Method

Xuan Zhang, Wei Gao

TL;DR

This paper tackles the challenge of verifying real-world news claims using large language models in a few-shot setting. It introduces Hierarchical Step-by-Step (HiSS) prompting, which decomposes claims into subclaims and verifies them through progressive questions, aided by web search when needed. HiSS demonstrates superior performance over strong fully supervised models and other few-shot baselines on RAWFC and LIAR, while also offering more fine-grained, human-interpretable explanations. The work highlights the potential of structured, evidence-grounded LLM reasoning for scalable, explainable fact verification and points to future directions in conversational, interactive fact-checking with human-in-the-loop systems.

Abstract

While large pre-trained language models (LLMs) have shown impressive capabilities in various NLP tasks, they remain under-explored in the misinformation domain. In this paper, we examine LLMs with in-context learning (ICL) for news claim verification and find that, with only 4-shot demonstration examples, several prompting methods perform comparably to previous supervised models. To further boost performance, we introduce a Hierarchical Step-by-Step (HiSS) prompting method, which directs LLMs to separate a claim into several subclaims and then verify each of them progressively via multiple question-answering steps. Experimental results on two public misinformation datasets show that HiSS prompting outperforms the state-of-the-art fully supervised approach and strong few-shot ICL-enabled baselines.

Paper Structure (24 sections, 3 figures, 9 tables)

Figures (3)

  • Figure 1: An example of claim verification based on vanilla CoT prompting. The claim (underlined) and CoT (in green) are given as a demonstration. The generated CoT (in italics) leads to an incorrect judgment due to (1) omission of necessary thoughts regarding "nukes", and (2) fact hallucination about the war-loving speeches without specific evidence in the generated CoT (in blue).
  • Figure 2: Overview of the proposed HiSS model. Original human inputs are shown on a red background, text generated directly by the LLM is in white, and answers generated from search results are in green. We start by providing a few-shot demonstration and then append the claim to be checked (underlined). HiSS prompts the LLM to (1) decompose the claim into subclaims; (2) verify each subclaim step by step by raising and answering a series of questions; and (3) generate the final prediction. For each question in step (2), we prompt the LLM to assess whether it is confident enough to answer; if not, we send the question to a web search engine, and the search results are inserted back into the ongoing prompt so that verification can continue. The detailed demonstrations are omitted from this illustration for space; they are given in the two HiSS prompt tables in the paper's prompt appendix.
  • Figure 3: Ablation results on the RAWFC dataset.
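
The control flow described in the Figure 2 caption (decompose, question each subclaim, fall back to search when the model is not confident, then predict) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: every name here (`hiss_verify`, `Step`, `Trace`, and the five callables) is hypothetical, the LLM and the search engine are passed in as plain functions, the `max_questions` cap is an added safeguard rather than something the paper specifies, and the toy labels in `demo` are placeholders rather than the RAWFC or LIAR label sets.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    subclaim: str
    question: str
    answer: str
    used_search: bool  # True when the answer came from the search fallback


@dataclass
class Trace:
    claim: str
    subclaims: List[str] = field(default_factory=list)
    steps: List[Step] = field(default_factory=list)
    label: str = ""


def hiss_verify(
    claim: str,
    decompose: Callable[[str], List[str]],
    next_question: Callable[[str, List[Step]], Optional[str]],
    answer_if_confident: Callable[[str], Optional[str]],
    search: Callable[[str], str],
    predict: Callable[[str, List[Step]], str],
    max_questions: int = 5,  # assumed cap, not taken from the paper
) -> Trace:
    """Run the three HiSS stages with caller-supplied (hypothetical) backends."""
    trace = Trace(claim=claim)
    # Stage 1: split the claim into subclaims.
    trace.subclaims = decompose(claim)
    # Stage 2: verify each subclaim through a sequence of questions.
    for sub in trace.subclaims:
        sub_steps: List[Step] = []
        while len(sub_steps) < max_questions:
            question = next_question(sub, sub_steps)
            if question is None:  # the model has no further questions
                break
            answer = answer_if_confident(question)
            used_search = answer is None
            if used_search:  # not confident: consult the search engine
                answer = search(question)
            sub_steps.append(Step(sub, question, answer, used_search))
        trace.steps.extend(sub_steps)
    # Stage 3: produce the final label from the collected evidence.
    trace.label = predict(claim, trace.steps)
    return trace


def demo() -> Trace:
    """Toy run with canned answers standing in for an LLM and a search engine."""
    questions = {"A said X": ["Did A say X?"], "X is true": ["Is X true?"]}

    def decompose(claim: str) -> List[str]:
        return list(questions)

    def next_question(sub: str, steps: List[Step]) -> Optional[str]:
        pending = questions[sub]
        return pending[len(steps)] if len(steps) < len(pending) else None

    def answer_if_confident(question: str) -> Optional[str]:
        return "Yes" if question == "Did A say X?" else None

    def search(question: str) -> str:
        return "No"  # canned search result

    def predict(claim: str, steps: List[Step]) -> str:
        answers = {s.answer for s in steps}
        if answers == {"Yes"}:
            return "true"
        return "false" if answers == {"No"} else "mixed"

    return hiss_verify("A said X, and X is true", decompose, next_question,
                       answer_if_confident, search, predict)
```

Calling `demo()` returns a `Trace` whose `steps` record, for each question, whether the answer came from the model or from the search fallback, which is the kind of step-level record that makes the final label inspectable.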