
Contradiction to Consensus: Dual Perspective, Multi Source Retrieval Based Claim Verification with Source Level Disagreement using LLM

Md Badsha Biswas, Ozlem Uzuner

TL;DR

This work presents a novel system for open-domain claim verification (ODCV) that combines large language models (LLMs), multi-perspective evidence retrieval, and cross-source disagreement analysis. The evaluation shows that knowledge aggregation not only improves claim verification but also reveals differences in source-specific reasoning.

Abstract

The spread of misinformation across digital platforms can pose significant societal risks. Claim verification (a.k.a. fact-checking) systems can help identify potential misinformation. However, their efficacy is limited by the knowledge sources they rely on. Most automated claim verification systems depend on a single knowledge source and use only the supporting evidence from that source; they ignore any disagreement between that source and others. This limits their knowledge coverage and transparency. To address these limitations, we present a novel system for open-domain claim verification (ODCV) that leverages large language models (LLMs), multi-perspective evidence retrieval, and cross-source disagreement analysis. Our approach introduces a novel retrieval strategy that collects evidence for both the original and the negated forms of a claim, enabling the system to capture supporting and contradicting information from diverse sources: Wikipedia, PubMed, and Google. These evidence sets are filtered, deduplicated, and aggregated across sources to form a unified and enriched knowledge base that better reflects the complexity of real-world information. LLMs then use this aggregated evidence to verify claims. We further enhance interpretability by analyzing model confidence scores to quantify and visualize inter-source disagreement. Through extensive evaluation on four benchmark datasets with five LLMs, we show that knowledge aggregation not only improves claim verification but also reveals differences in source-specific reasoning. Our findings underscore the importance of embracing diversity, contradiction, and aggregation in evidence for building reliable and transparent claim verification systems.
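The retrieval and aggregation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Evidence` type, the `retrievers` mapping (source name to a callable returning `(sentence, score)` pairs), the relevance score, and the whitespace and case normalization used for deduplication are all assumptions made for illustration, and the negated claim is taken as an input rather than generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str      # e.g. "wikipedia", "pubmed", "google"
    sentence: str
    score: float     # hypothetical relevance score from a ranker


def normalize(sentence: str) -> str:
    """Assumed dedup key: lowercase with collapsed whitespace."""
    return " ".join(sentence.lower().split())


def top_p_per_source(candidates, p):
    """Deduplicate within each source and keep the p best-scoring sentences."""
    by_source = {}
    for ev in sorted(candidates, key=lambda e: e.score, reverse=True):
        kept = by_source.setdefault(ev.source, {})
        key = normalize(ev.sentence)
        if key not in kept and len(kept) < p:
            kept[key] = ev
    return {src: list(kept.values()) for src, kept in by_source.items()}


def aggregate(per_source):
    """Merge per-source evidence into one set, dropping cross-source duplicates."""
    seen, merged = set(), []
    for src in sorted(per_source):
        for ev in per_source[src]:
            key = normalize(ev.sentence)
            if key not in seen:
                seen.add(key)
                merged.append(ev)
    return merged


def collect_evidence(claim, negated_claim, retrievers, p):
    """Query every source with both the claim and its negation, then aggregate."""
    candidates = []
    for source, retrieve in retrievers.items():
        for query in (claim, negated_claim):
            for sentence, score in retrieve(query):
                candidates.append(Evidence(source, sentence, score))
    return aggregate(top_p_per_source(candidates, p))
```

Querying with both forms of the claim lets contradicting sentences enter the candidate pool even when a retriever favors text that lexically matches the query. The aggregated list would then be passed to an LLM as the evidence for verification.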

Paper Structure (25 sections, 5 equations, 7 figures, 4 tables)

This paper contains 25 sections, 5 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Workflow of the proposed system. A claim and its negation are used to retrieve sentence-level evidence from Wikipedia, PubMed, and Google; candidate sentences are ranked and deduplicated, keeping the top-$p$ per source. The resulting candidate sentences from each source are aggregated into a cumulative evidence set $E_i$, which a zero-shot LLM uses to verify claims; log-probabilities for each source are reported and visualized to convey agreement and uncertainty.
  • Figure 2: Confidence distribution (KDE) across different knowledge sources for the AVeriTeC dataset, illustrating variation in model certainty and inter-source disagreement during claim verification.
  • Figure 3: Confidence distribution (KDE) across different knowledge sources for the LIAR dataset, illustrating variation in model certainty and inter-source disagreement during claim verification.
  • Figure 4: Confidence distribution (KDE) across different knowledge sources for the PUBHEALTH dataset, illustrating variation in model certainty and inter-source disagreement during claim verification.
  • Figure 5: Confidence distribution (KDE) across different knowledge sources for the SciFact dataset, illustrating variation in model certainty and inter-source disagreement during claim verification.
  • ...and 2 more figures
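
The captions above refer to per-source log-probabilities and confidence distributions. The following sketch shows one simple way such values could be turned into a disagreement summary. It is an illustration under stated assumptions, not the paper's metric: the `predictions` layout (source name mapped to a predicted label and its log-probability), the use of `exp(logprob)` as the confidence, and the range-based spread are all hypothetical choices.

```python
import math


def source_disagreement(predictions):
    """Summarize agreement across sources.

    predictions: dict mapping source name -> (label, logprob), where logprob
    is the log-probability the model assigned to its predicted label
    (an assumed input format).
    """
    # Convert each log-probability into a probability in [0, 1].
    confidence = {src: math.exp(lp) for src, (_, lp) in predictions.items()}
    labels = {label for label, _ in predictions.values()}
    return {
        "confidence": confidence,
        # True only when every source leads to the same verdict.
        "labels_agree": len(labels) == 1,
        # Gap between the most and least confident source.
        "confidence_range": max(confidence.values()) - min(confidence.values()),
    }


summary = source_disagreement({
    "wikipedia": ("SUPPORTED", math.log(0.9)),
    "pubmed": ("REFUTED", math.log(0.6)),
})
print(summary["labels_agree"], round(summary["confidence_range"], 2))
```

Collecting the per-source confidences over a whole dataset would give the samples from which density plots like those in Figures 2 to 5 could be drawn.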