Seeing Through AI's Lens: Enhancing Human Skepticism Towards LLM-Generated Fake News

Navid Ayoobi, Sadat Shahriar, Arjun Mukherjee

TL;DR

This work tackles the problem of distinguishing human-written news from LLM-generated fake news and proposes a reader-oriented defense based on the Entropy-Shift Authorship Signature (ESAS), an information-theoretic metric that ranks terms (words, bigrams, and POS tags) by their discriminative power for authorship. The authors construct a dataset of 39,000 articles, with LLM-generated text drawn from four LLMs across three levels of fakeness, and use ESAS to select a small, highly informative vocabulary. With this tiny feature set, a simple TF-IDF plus logistic regression classifier achieves high accuracy, surpassing baseline reader judgments in many settings. The authors systematically analyze cues at the unigram, bigram, and POS levels and report human-vs-LLM differences that hold across topics and generation strategies, such as more frequent use of "said" by humans and model-specific bigrams like "the ongoing" for certain LLMs. The work culminates in practical cues that readers can use to heighten their skepticism toward AI-generated news, and it provides a publicly available dataset to foster future research on robust cross-LLM detection and reader-awareness methods.

Abstract

LLMs offer valuable capabilities, yet malicious users can exploit them to disseminate deceptive information and generate fake news. The growing prevalence of LLMs makes it difficult to craft detection approaches that remain effective across text domains. Additionally, the absence of precautionary measures for AI-generated news on online social platforms is concerning. There is therefore an urgent need to improve people's ability to differentiate between news articles written by humans and those produced by LLMs. By providing cues that characterize human-written and LLM-generated news, we can help individuals increase their skepticism towards fake LLM-generated news. This paper aims to elucidate simple markers that help individuals distinguish between articles penned by humans and those created by LLMs. To achieve this, we first collected a dataset of 39k news articles either authored by humans or generated by four distinct LLMs with varying degrees of fakeness. We then devise a metric named Entropy-Shift Authorship Signature (ESAS), based on information theory and entropy principles. ESAS ranks terms or entities, such as POS tags, within news articles by their relevance in discerning article authorship. We demonstrate the effectiveness of our metric by showing the high accuracy attained by a basic method, i.e., TF-IDF combined with a logistic regression classifier, using a small set of terms with the highest ESAS scores. Consequently, we introduce and scrutinize these top ESAS-ranked terms to help individuals strengthen their skepticism towards LLM-generated fake news.
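
The idea of ranking terms by how much they reveal about authorship can be illustrated with a small sketch. The ESAS formula itself is not given in this section, so the code below substitutes classic information gain (the reduction in label entropy from knowing whether a term is present) as a stand-in; it should not be read as the authors' metric. The function names and the toy corpus are hypothetical, and the example sentences are invented, borrowing only the cues "said" and "the ongoing" mentioned in the TL;DR.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a label distribution given as counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain_ranking(docs, labels):
    """Rank terms by how much knowing their presence reduces label entropy.

    NOTE: this is plain information gain, used as a stand-in.
    It is NOT the ESAS formula, which this section does not specify.
    """
    label_counts = Counter(labels)
    classes = sorted(label_counts)
    base = entropy([label_counts[c] for c in classes])
    n = len(docs)

    # For each term, count the documents of each class that contain it.
    present = {}
    for text, label in zip(docs, labels):
        for term in set(text.lower().split()):
            present.setdefault(term, Counter())[label] += 1

    scores = {}
    for term, with_term in present.items():
        n_with = sum(with_term.values())
        n_without = n - n_with
        h_with = entropy([with_term[c] for c in classes])
        h_without = entropy([label_counts[c] - with_term[c] for c in classes])
        conditional = (n_with / n) * h_with + (n_without / n) * h_without
        scores[term] = base - conditional

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy corpus, made up for illustration (not taken from the paper's dataset).
docs = [
    "the minister said the budget will rise",
    "police said the road was closed",
    "the ongoing debate highlights complex challenges",
    "the ongoing crisis underscores growing concern",
]
labels = ["human", "human", "llm", "llm"]

ranking = information_gain_ranking(docs, labels)
top_terms = [term for term, _ in ranking[:2]]
print(top_terms)
```

On this toy corpus, "ongoing" and "said" rank highest because each appears in every article of one class and in none of the other, while "the" scores zero because it appears everywhere. A small vocabulary selected this way could then feed a TF-IDF plus logistic regression classifier of the kind the abstract describes.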

Paper Structure

This paper contains 13 sections, 6 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Accuracy of TF-IDF in classifying the authorship of news articles as a function of the number of words selected for its vocabulary using the ESAS metric.
  • Figure 2: Word cloud of the 10 most significant bigrams, with font size representing the relative ESAS score. Terms highlighted in green and red have higher relative frequencies in HANA and LGNA, respectively.
  • Figure 3: Accuracy of TF-IDF in classifying the authorship of news articles as a function of the number of POS bigrams selected for its vocabulary using the ESAS metric.
  • Figure 4: Bar plots of the 10 most significant POS bigrams across different LLMs for the "Extended summary" prompt strategy.