Minimal Evidence Group Identification for Claim Verification

Xiangci Li, Sihao Chen, Rajvi Kapadia, Jessica Ouyang, Fan Zhang

TL;DR

The paper addresses claim verification when multiple evidence perspectives can support a claim by formalizing Minimal Evidence Groups (MEGs) that are sufficient, non-redundant, and minimal. It proposes a two-step MEG identification method (support prediction followed by bottom-up merging with redundancy checks) that outperforms direct LLM prompting on MEG datasets derived from WiCE and SciFact. Intrinsic evaluations show precision gains from decomposing the problem and pruning redundant evidence, while extrinsic tests show that MEGs enable more compact, budget-friendly downstream claim generation. The work highlights practical implications for scalable verification and transparent reasoning, while acknowledging limitations in handling contradictions, annotation reliability, and computation time due to NP-hardness.

Abstract

Claim verification in real-world settings (e.g. against a large collection of candidate evidence retrieved from the web) typically requires identifying and aggregating a complete set of evidence pieces that collectively provide full support to the claim. The problem becomes particularly challenging when there exist distinct sets of evidence that could be used to verify the claim from different perspectives. In this paper, we formally define and study the problem of identifying such minimal evidence groups (MEGs) for claim verification. We show that MEG identification can be reduced from the Set Cover problem, based on entailment inference of whether a given evidence group provides full/partial support to a claim. Our proposed approach achieves 18.4% and 34.8% absolute improvements on the WiCE and SciFact datasets over LLM prompting. Finally, we demonstrate the benefits of MEGs in downstream applications such as claim generation.
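To make the Set Cover connection concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a toy setting in which a claim is a set of atomic facts and each evidence piece covers some of them; the names `CLAIM_FACTS`, `EVIDENCE`, `support`, and `irreducible_full_groups` are hypothetical. In practice an entailment model would take the place of `support`. The sketch covers only sufficiency and non-redundancy, since this section does not spell out the exact minimality criterion.

```python
from itertools import combinations

# Toy stand-in for an entailment model (an assumption, not the paper's model):
# the claim is a set of atomic facts, and each evidence piece covers some of them.
CLAIM_FACTS = {"a", "b", "c"}
EVIDENCE = {
    "e1": {"a", "b"},
    "e2": {"c"},
    "e3": {"a", "b", "c"},
    "e4": {"b"},
}


def support(group):
    """Return 'full', 'partial', or 'none' for a group of evidence ids."""
    covered = set().union(*(EVIDENCE[e] for e in group)) & CLAIM_FACTS
    if covered == CLAIM_FACTS:
        return "full"
    return "partial" if covered else "none"


def irreducible_full_groups(evidence_ids):
    """Enumerate groups by increasing size and keep the fully supporting
    groups that contain no smaller fully supporting group."""
    # Step 1: drop pieces that give no support on their own.
    candidates = sorted(e for e in evidence_ids if support((e,)) != "none")
    found = []
    # Step 2: grow groups bottom-up, skipping anything already redundant.
    for size in range(1, len(candidates) + 1):
        for group in combinations(candidates, size):
            g = frozenset(group)
            if any(prev <= g for prev in found):
                continue  # redundant: a smaller sufficient group is inside
            if support(g) == "full":
                found.append(g)
    return found


groups = irreducible_full_groups(EVIDENCE)
print([sorted(g) for g in groups])
```

On this toy input the claim can be verified in two distinct ways: by `e3` alone, or by `e1` and `e2` together, while `e4` never appears because it adds nothing to any sufficient group. The enumeration is exponential in the number of candidates, which is consistent with the NP-hardness noted in the TL;DR.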

Paper Structure (32 sections, 1 figure, 7 tables, 3 algorithms)

Figures (1)

  • Figure 1: The problem of minimal evidence group identification for claim verification: given a claim and a list of candidate evidence pieces, the task is to identify the sets of minimal, non-redundant evidence, where each set provides full support for the claim.