Robust Claim Verification Through Fact Detection
Nazanin Jafari, James Allan
TL;DR
This work tackles robustness and reasoning in scientific claim verification by introducing FactDetect, which distills concise, evidence-derived facts from supporting documents. FactDetect uses a three-stage short-fact generation process and a weak-labeling scheme to produce facts that are then integrated via multitask learning or augmented prompting (AugFactDetect) for zero-shot verification, yielding consistent gains on multiple datasets. The approach achieves an average improvement of about 3.0% in supervised F1 and up to 28.1% in zero-shot F1 across SciFact, SciFact-Open, and HealthVer, while also enhancing explainability through explicit fact-level reasoning. Beyond claim verification, FactDetect demonstrates potential for broader factuality evaluation of LLM outputs and other fact-checking applications, albeit with limitations tied to the quality of the underlying generative models and prompting overhead.
Abstract
Claim verification can be a challenging task. In this paper, we present a method that enhances the robustness and reasoning capabilities of automated claim verification by extracting short facts from evidence. Our novel approach, FactDetect, leverages Large Language Models (LLMs) to generate concise factual statements from evidence and to label these facts according to their semantic relevance to the claim and evidence. The generated facts are then combined with the claim and evidence. To train a lightweight supervised model, we add a fact-detection task to claim verification in a multitask setup, which improves both performance and explainability. We also show that augmenting the claim verification prompt with the generated facts (AugFactDetect) improves zero-shot verdict prediction with LLMs. In the supervised setting, our method achieves competitive results, improving the F1 score by 15% on challenging scientific claim verification datasets. In the zero-shot setting, AugFactDetect outperforms the baselines with statistical significance on three challenging scientific claim verification datasets, with an average performance gain of 17.3% over the best-performing baselines.
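To make the prompt-augmentation idea concrete, the following is a minimal sketch of how short facts might be attached to a claim and its evidence before zero-shot verdict prediction. It is not the authors' implementation: the `Fact` class, the `build_augmented_prompt` function, the prompt wording, and the verdict label names are all hypothetical, and the relevance flag stands in for whatever labeling scheme is actually used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fact:
    """A short factual statement derived from the evidence (hypothetical type)."""
    text: str
    relevant: bool  # assumed weak label: is the fact relevant to the claim?


def build_augmented_prompt(claim: str, evidence: str, facts: List[Fact]) -> str:
    """Assemble a zero-shot prompt from evidence, claim, and relevant facts.

    The layout and label names below are illustrative assumptions only.
    """
    # Keep only the facts flagged as relevant to the claim.
    relevant = [fact.text for fact in facts if fact.relevant]

    lines = [f"Evidence: {evidence}", f"Claim: {claim}"]
    if relevant:
        lines.append("Facts:")
        # Number the facts so a model can refer to them in its reasoning.
        lines.extend(f"{i}. {text}" for i, text in enumerate(relevant, start=1))
    lines.append("Verdict (SUPPORT, CONTRADICT, or NOT ENOUGH INFO):")
    return "\n".join(lines)


if __name__ == "__main__":
    facts = [
        Fact("Drug X reduced symptoms in the treated group.", True),
        Fact("The study was conducted in 2019.", False),
    ]
    print(build_augmented_prompt(
        claim="Drug X alleviates symptoms.",
        evidence="In a randomized trial, drug X reduced symptoms in the treated group.",
        facts=facts,
    ))
```

Filtering on the relevance flag before building the prompt keeps the added context short, which matters when prompting overhead is a concern.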
