Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors

Guanghua Li; Wensheng Lu; Wei Zhang; Defu Lian; Kezhong Lu; Rui Mao; Kai Shu; Hao Liao

TL;DR

This paper introduces a novel retrieval-augmented LLM framework, the first of its kind to automatically and strategically extract key evidence from web sources for claim verification.

Abstract

The proliferation of fake news has had far-reaching implications for politics, the economy, and society at large. While fake news detection methods have been employed to mitigate this issue, they primarily depend on two essential elements: the quality and relevance of the evidence, and the effectiveness of the verdict prediction mechanism. Traditional methods, which often source information from static repositories like Wikipedia, are limited by outdated or incomplete data, particularly for emerging or rare claims. Large Language Models (LLMs), known for their remarkable reasoning and generative capabilities, introduce a new frontier for fake news detection. However, like traditional methods, LLM-based solutions also grapple with the limitations of stale and long-tail knowledge. Additionally, retrieval-enhanced LLMs frequently struggle with issues such as low-quality evidence retrieval and context length constraints. To address these challenges, we introduce a novel retrieval-augmented LLM framework, the first of its kind to automatically and strategically extract key evidence from web sources for claim verification. Employing a multi-round retrieval strategy, our framework ensures the acquisition of sufficient, relevant evidence, thereby enhancing performance. Comprehensive experiments across three real-world datasets validate the framework's superiority over existing methods. Importantly, our model not only delivers accurate verdicts but also offers human-readable explanations to improve result interpretability.

Paper Structure (26 sections, 3 equations, 7 figures, 10 tables)

Introduction
Related Work
RAG LLMs
Natural Language Inference LLMs
Methods
Retrieval Module
Reasoning Module
Re-Search Mechanism
Experiments
Experiments Setup
Datasets
Baselines
Implementation details
Main Results
Internet Search Comparison Study
...and 11 more sections

Figures (7)

Figure 1: A motivating example of our model. (a) BERT-driven methods: up-to-date evidence cannot be retrieved. (b) One-shot retrieval-enhanced LLMs: only partial evidence can be retrieved. (c) Strategic Internet-based LLMs: multi-round retrieval of evidence from the Internet facilitates more comprehensive and accurate assessments.
Figure 2: Overview of the STEEL framework. The framework has three parts: (a) Retrieval module. The claim or an updated query is sent to a search engine, and the results are sorted and selected according to the similarity between the retrieved documents and the paragraphs of the claim. (b) Reasoning module. The newly obtained evidence and the established evidence are fed to the LLM through carefully designed prompts, and the LLM reasons over them and outputs one of three verdicts, "true", "false", or "NEI" (Not Enough Information), together with a confidence level. Even when the output is "NEI", the LLM compresses the newly obtained information into the pool of established evidence for subsequent searches. (c) Re-search mechanism. The framework searches for more evidence when the output is "NEI" or the confidence level is below $50\%$. The LLM generates "updated queries" to improve the quality of the retrieved evidence.
Figure 3: Situations necessitating re-search: Irrelevant Evidence denotes evidence unrelated to the query or claim. Insufficient Evidence indicates inadequate evidence for reaching a valid conclusion. Lack of Confidence signifies uncertainty or low confidence in the conclusion's accuracy based on evidence.
Figure 4: Ablation study results: STEEL denotes complete model performance, STEEL-RR represents removal of the re-search mechanism, and STEEL-RS represents GPT-3.5-Turbo without the search module.
Figure 5: F1-Ma for various numbers of re-search rounds on three challenging claim verification datasets: LIAR (orange line), CHEF (red line), and PolitiFact (blue line).
...and 2 more figures
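
The control flow described in the Figure 2 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Verdict`, `select_evidence`, and `verify` are hypothetical, the token-overlap score is only a stand-in for whatever similarity measure the paper uses, the `max_rounds` cap is assumed, and the search engine and LLM are replaced by toy functions. Only the stopping rule (re-search on "NEI" or confidence below 50%) and the carrying forward of compressed evidence come from the caption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    label: str         # "true", "false", or "NEI"
    confidence: float  # in [0, 1]
    summary: str       # compressed evidence to carry into later rounds
    next_query: str    # updated query proposed for a further search


# Hypothetical interfaces: a search engine and an LLM reasoning step.
SearchFn = Callable[[str], List[str]]
ReasonFn = Callable[[str, List[str], List[str]], Verdict]


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase tokens (an assumed stand-in)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_evidence(claim: str, docs: List[str], top_k: int = 2) -> List[str]:
    """Sort retrieved documents by similarity to the claim and keep the top k."""
    return sorted(docs, key=lambda d: token_overlap(claim, d), reverse=True)[:top_k]


def verify(claim: str, search: SearchFn, reason: ReasonFn,
           max_rounds: int = 3, threshold: float = 0.5) -> Verdict:
    """Multi-round loop: retrieve, reason, and re-search while evidence is weak."""
    established: List[str] = []  # pool of established (compressed) evidence
    query = claim                # the first round searches with the claim itself
    verdict = Verdict("NEI", 0.0, "", claim)
    for _ in range(max_rounds):
        new_evidence = select_evidence(claim, search(query))
        verdict = reason(claim, new_evidence, established)
        # Stop only on a non-NEI verdict with confidence of at least 50%.
        if verdict.label != "NEI" and verdict.confidence >= threshold:
            break
        if verdict.summary:
            established.append(verdict.summary)
        query = verdict.next_query
    return verdict


# Toy stand-ins for the search engine and the LLM, for illustration only.
def toy_search(query: str) -> List[str]:
    corpus = {
        "the moon is made of cheese": [
            "cheese prices rose last year",
            "the moon orbits the earth",
        ],
        "moon composition rock samples": [
            "moon rock samples are basalt and anorthosite",
        ],
    }
    return corpus.get(query, [])


def toy_reason(claim: str, new_evidence: List[str], established: List[str]) -> Verdict:
    if any("rock samples" in e for e in new_evidence):
        return Verdict("false", 0.9, "samples show rock", "")
    return Verdict("NEI", 0.2, "no composition evidence yet",
                   "moon composition rock samples")


result = verify("the moon is made of cheese", toy_search, toy_reason)
print(result.label, result.confidence)
```

In this toy run the first round finds nothing decisive, so the loop stores the compressed summary, switches to the updated query, and reaches a confident verdict in the second round. The cap on rounds matters in practice because each round costs a search call and an LLM call.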
