Prompt-Induced Linguistic Fingerprints for LLM-Generated Fake News Detection

Chi Wang, Min Gao, Zongwei Wang, Junwei Yin, Kai Shu, Chenghua Lin

Abstract

With the rapid development of large language models, generating fake news has become increasingly effortless, posing a growing societal threat and underscoring the urgent need for reliable detection methods. Early efforts to identify LLM-generated fake news have predominantly focused on the textual content itself; however, because much of that content may appear coherent and factually consistent, the subtle traces of falsification are often difficult to uncover. Through distributional divergence analysis, we uncover prompt-induced linguistic fingerprints: statistically distinct probability shifts between LLM-generated real and fake news when maliciously prompted. Based on this insight, we propose a novel method named Linguistic Fingerprints Extraction (LIFE). By reconstructing word-level probability distributions, LIFE can find discriminative patterns that facilitate the detection of LLM-generated fake news. To further amplify these fingerprint patterns, we also leverage key-fragment techniques that accentuate subtle linguistic differences, thereby improving detection reliability. Our experiments show that LIFE achieves state-of-the-art performance in detecting LLM-generated fake news and maintains high performance on human-written fake news. The code and data are available at https://anonymous.4open.science/r/LIFE-E86A.

Paper Structure

This paper contains 37 sections, 18 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Linguistic fingerprints: a comparison of the reconstruction probabilities of original tokens in real and fake news under malicious prompting. Colored tokens denote the reconstructed tokens. In the example, reconstruction assigns lower probabilities to the original tokens of real news than to those of fake news, illustrating the prompt-induced linguistic fingerprint, i.e., the distributional divergence in token-level generation probabilities.
  • Figure 2: Distribution of token reconstruction probabilities (on a -log scale) for real and fake news. (a) shows the overall probability distributions on the GossipCop++ dataset, while (b) shows the corresponding distributions on the VLFPN dataset. Fake news tends to have higher reconstruction probabilities, reflected by a leftward shift in the -log probability distribution.
  • Figure 3: Overall framework of LIFE. It consists of three main stages. (1) Key fragment extraction: a pre-trained model evaluates the classification loss difference between the original and masked news to identify critical sentences. (2) Reconstruction probability acquisition: an LLM guided by a malicious prompt reconstructs token-level probabilities for the selected key fragments. (3) Classification: the obtained probability vectors are used to train a classifier that distinguishes fake from real news.
  • Figure 4: Ablation study on four datasets.
  • Figure 5: A case study on the differences in key sentences and reconstruction probability distributions. (a) and (b) show the visualizations of key sentences from real and fake news, respectively. (c) and (d) compare the reconstruction probabilities of all words in the complete news articles and the corresponding key sentences.
  • ...and 1 more figure
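
The three stages described in the Figure 3 caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `select_key_fragments`, `neg_log_features`, and the parameters `loss_fn`, `top_k`, `length`, and `eps` are hypothetical names; "masking" is approximated here by dropping one sentence at a time; and the per-token reconstruction probabilities are assumed to have already been obtained from an LLM under the malicious prompt. The resulting fixed-length vectors of -log probabilities (the scale used in Figure 2) would then be passed to whichever classifier is chosen for stage 3.

```python
import math
from typing import Callable, List, Sequence


def select_key_fragments(
    sentences: Sequence[str],
    loss_fn: Callable[[Sequence[str]], float],
    top_k: int = 2,
) -> List[str]:
    """Stage 1 (sketch): keep the sentences whose removal changes the loss most.

    loss_fn is a hypothetical stand-in for the classification loss of a
    pre-trained model evaluated on a list of sentences.
    """
    base = loss_fn(sentences)
    deltas = []
    for i in range(len(sentences)):
        # Assumption: "masking" a sentence means leaving it out entirely.
        masked = list(sentences[:i]) + list(sentences[i + 1:])
        deltas.append((abs(loss_fn(masked) - base), i))
    # Largest loss change first; ties are broken by original position.
    top = sorted(deltas, key=lambda d: (-d[0], d[1]))[:top_k]
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(i for _, i in top)]


def neg_log_features(
    token_probs: Sequence[float],
    length: int = 8,
    eps: float = 1e-12,
) -> List[float]:
    """Stages 2-3 (sketch): map token probabilities to a fixed-length vector.

    token_probs are assumed to be the LLM's reconstruction probabilities for
    the original tokens of the key fragments. Each one is converted to -log p,
    then the vector is truncated or zero-padded to `length`.
    """
    feats = [-math.log(max(p, eps)) for p in token_probs[:length]]
    # Assumption: pad with 0.0, which corresponds to a probability of 1.
    return feats + [0.0] * (length - len(feats))
```

Under this sketch, the shift reported in Figure 2 would show up as smaller feature values for fake news than for real news on average. The zero-padding choice is only one option; a mask or a summary statistic over the tokens would serve the same purpose.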