Is LLMs Hallucination Usable? LLM-based Negative Reasoning for Fake News Detection

Chaowei Zhang, Zongling Feng, Zewei Zhang, Jipeng Qiang, Guandong Xu, Yun Li

TL;DR

The study investigates whether LLM knowledge hallucination can be harnessed to generate negative reasoning to aid fake news detection. It introduces SR$^3$ to supervise LLM outputs toward high-quality positive and negative reasoning, and builds NRFE, a dual-encoder model that learns semantic consistency between news and reasoning, along with NRFE-D, a distillation-based student that operates on news content alone. Across three datasets, NRFE-D achieves state-of-the-art performance, surpassing prompting-based LLM approaches, fine-tuned SLMs, and other methods. The work demonstrates a novel use of hallucinations as adversarial signals to improve robustness and accuracy in fake news detection, with potential for further exploration using additional LLMs and multi-agent setups.

Abstract

The questionable responses caused by knowledge hallucination may make LLMs' decision-making unstable. However, it has never been investigated whether LLM hallucination can be used to generate negative reasoning that facilitates the detection of fake news. This study proposes a novel supervised self-reinforced reasoning rectification approach, SR$^3$, which uses LLM reflection to yield both common reasonable reasoning and wrong understandings (negative reasoning) for news, for use in semantic consistency learning. Building on this, we construct a negative reasoning-based news learning model, NRFE, which leverages positive or negative news-reasoning pairs to learn the semantic consistency between them. To avoid the impact of label-implicated reasoning, we deploy a student model, NRFE-D, which takes only news content as input and distills knowledge from NRFE, in order to assess the performance of our method. Experimental results on three popular fake news datasets demonstrate the superiority of our method over three kinds of baselines: prompting on LLMs, fine-tuning on pre-trained SLMs, and other representative fake news detection methods.
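The rectification idea described in the abstract can be pictured as a retry loop: request negative reasoning, check it, and ask again if it is rejected. The sketch below is only an illustration under stated assumptions. The function `rectify_negative_reasoning`, the `generate` callback (a stand-in for an LLM call), the toy checks `non_empty` and `long_enough`, and the `max_rounds` limit are all hypothetical names introduced here; the paper's actual constraints and prompts are not given in this summary.

```python
from typing import Callable, List, Optional

# Hypothetical signature types: a generator takes the news text plus the
# previously rejected drafts; a constraint takes (news, reasoning) -> bool.
Generator = Callable[[str, List[str]], str]
Constraint = Callable[[str, str], bool]


def rectify_negative_reasoning(
    news: str,
    generate: Generator,
    constraints: List[Constraint],
    max_rounds: int = 5,
) -> Optional[str]:
    """Retry generation until every constraint accepts the candidate."""
    rejected: List[str] = []
    for _ in range(max_rounds):
        candidate = generate(news, rejected)
        if all(check(news, candidate) for check in constraints):
            return candidate
        # Keep rejected drafts so the generator can react to them.
        rejected.append(candidate)
    return None  # no acceptable reasoning within the round limit


# Toy stand-ins (assumptions, not the paper's constraints or an LLM).
def toy_generate(news: str, rejected: List[str]) -> str:
    drafts = [
        "",
        "The claim is true.",
        "Officials confirmed the claim, so it must be real.",
    ]
    return drafts[min(len(rejected), len(drafts) - 1)]


def non_empty(news: str, reasoning: str) -> bool:
    return bool(reasoning.strip())


def long_enough(news: str, reasoning: str) -> bool:
    return len(reasoning.split()) >= 6


result = rectify_negative_reasoning(
    "Some headline", toy_generate, [non_empty, long_enough]
)
```

With these toy inputs the first two drafts are rejected and the third is accepted; passing the rejected drafts back to the generator is one simple way to model reflection, chosen here for illustration only.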

Paper Structure

This paper contains 16 sections, 9 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Examples of positive and negative reasoning for news, requested via LLM prompting. Current studies are devoted to avoiding negative reasoning, whereas we are the first to propose leveraging LLM hallucination to generate negative reasoning, coupled with positive reasoning, for adversarial semantic consistency learning.
  • Figure 2: Illustration of our proposed self-reinforced reasoning rectification approach, SR$^3$. Besides requesting positive reasoning, SR$^3$ also uses LLM hallucination to iteratively generate negative reasoning until the generated negative reasoning satisfies the two constraints. More details of SR$^3$ can be found in Algorithm 1.
  • Figure 3: The system frameworks of NRFE and NRFE-D. The two Encoders in NRFE represent news and reasoning, respectively. The dashed orange blocks mirror the orange blocks, meaning that their parameters are shared. In the knowledge distillation model NRFE-D, the Encoder used for news representation inherits the parameters of the news Encoder fine-tuned in NRFE. NRFE-D has two learning objectives: the fused representation from NRFE (Dis Loss: $\mathcal{L}_{dis}$) and the hard label of the data (CLS Loss: $\mathcal{L}'_{cls}$).
  • Figure 4: Ablation comparisons among NRFE-D and its variants in terms of accuracy and MacF1 on three datasets. RC is the RC loss used for learning the semantic correlation between news $f_x$ and reasoning $f_p$ or $f_n$, as well as RXC and XRC in Figure 3. The last variant, Only RC, denotes that RXC and XRC are disabled for ablation evaluation.
  • Figure 5: Ablation comparisons on the PolitiFact dataset among the teacher model NRFE and its four variants, visualizing the features of their last hidden layer using t-SNE. Blue dots denote real news; dots of the other color denote fake news.
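
The Figure 3 caption names two learning objectives for NRFE-D: a distillation loss $\mathcal{L}_{dis}$ toward the teacher's fused representation and a classification loss $\mathcal{L}'_{cls}$ on the hard label. A minimal sketch of how such a two-term objective might be combined follows. The mean squared error form of the distillation term, the weighting factor `alpha`, and all function names are assumptions made for illustration; this summary does not state the paper's actual formulas.

```python
import math
from typing import Sequence, Tuple


def mse(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Assumed distillation term: mean squared error between features."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Classification term: softmax cross-entropy against a hard label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def student_loss(
    student_feat: Sequence[float],
    teacher_feat: Sequence[float],
    logits: Sequence[float],
    label: int,
    alpha: float = 0.5,  # hypothetical weighting, not from the paper
) -> Tuple[float, float, float]:
    """Return (total, l_dis, l_cls) for one example."""
    l_dis = mse(student_feat, teacher_feat)
    l_cls = cross_entropy(logits, label)
    return alpha * l_dis + (1.0 - alpha) * l_cls, l_dis, l_cls


# Student features equal to the teacher's give a zero distillation term;
# uniform logits over two classes give a classification term of log(2).
total, l_dis, l_cls = student_loss([1.0, 2.0], [1.0, 2.0], [0.0, 0.0], label=1)
```

In this setup the teacher's fused representation is treated as a fixed target, so only the student, which sees news content alone, would be updated by the combined loss.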