Seeing Through AI's Lens: Enhancing Human Skepticism Towards LLM-Generated Fake News
Navid Ayoobi, Sadat Shahriar, Arjun Mukherjee
TL;DR
This work tackles the problem of distinguishing human-written news from LLM-generated fake news and proposes a reader-oriented defense based on the Entropy-Shift Authorship Signature (ESAS), an information-theoretic metric that ranks words by their power to discriminate authorship. The authors construct a dataset of $39{,}000$ articles, written by humans or generated by four LLMs at three levels of fakeness, and use ESAS to select a small, highly informative vocabulary. With this tiny feature set, a simple TF-IDF plus logistic regression classifier achieves high accuracy, surpassing baseline reader judgments in many settings. The authors systematically analyze cues at the unigram, bigram, and POS levels and find human-vs-LLM differences that hold across topics and generation strategies, such as humans' more frequent use of "said" and model-specific bigrams like "the ongoing" for certain LLMs. The work concludes with practical cues readers can use to heighten their skepticism toward AI-generated news, and it releases the dataset publicly to foster future research on robust cross-LLM detection and reader-awareness methods.
Abstract
LLMs offer valuable capabilities, yet malicious users can exploit them to generate fake news and disseminate deceptive information. The growing prevalence of LLMs makes it difficult to craft detection approaches that remain effective across text domains. Additionally, the absence of precautionary measures for AI-generated news on online social platforms is concerning. There is therefore an urgent need to improve people's ability to differentiate between news articles written by humans and those produced by LLMs. By identifying cues that characterize human-written and LLM-generated news, we can help individuals increase their skepticism towards fake LLM-generated news. This paper aims to elucidate simple markers that help individuals distinguish between articles penned by humans and those created by LLMs. To achieve this, we first collect a dataset of 39k news articles, either authored by humans or generated by four distinct LLMs with varying degrees of fakeness. We then devise a metric named Entropy-Shift Authorship Signature (ESAS), based on information theory and entropy principles. ESAS ranks terms or entities, such as POS tags, within news articles by their relevance in discerning article authorship. We demonstrate the effectiveness of the metric by showing that a basic method, i.e., TF-IDF combined with a logistic regression classifier, attains high accuracy using only a small set of terms with the highest ESAS scores. Finally, we introduce and scrutinize these top ESAS-ranked terms to help individuals strengthen their skepticism towards LLM-generated fake news.
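To make the idea of an entropy-based term ranking concrete, the following is a minimal sketch and not the ESAS formula itself, which is not defined in this section. It assumes a stand-in score, plain information gain: the reduction in authorship entropy once we know whether a term occurs in an article. The function names (`entropy_shift_scores`, `top_terms`), the whitespace tokenization, and the four toy sentences are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a label distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_shift_scores(docs, labels):
    """Score each term by how much its presence/absence reduces authorship entropy.

    Assumption: this is ordinary information gain, used as a stand-in.
    It is NOT the paper's ESAS definition.
    """
    n = len(docs)
    label_counts = Counter(labels)
    classes = sorted(label_counts)
    base = entropy([label_counts[c] for c in classes])

    # For each term, count how many documents of each class contain it.
    present = {}
    for text, label in zip(docs, labels):
        for term in set(text.lower().split()):
            present.setdefault(term, Counter())[label] += 1

    scores = {}
    for term, with_term in present.items():
        n_with = sum(with_term.values())
        without_term = [label_counts[c] - with_term[c] for c in classes]
        conditional = (n_with / n) * entropy([with_term[c] for c in classes])
        conditional += ((n - n_with) / n) * entropy(without_term)
        scores[term] = base - conditional
    return scores


def top_terms(docs, labels, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    scores = entropy_shift_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Made-up toy corpus (not from the paper's dataset).
DOCS = [
    "the minister said the plan failed",
    "officials said the vote was delayed",
    "the ongoing crisis highlights the challenges",
    "the ongoing debate underscores the tension",
]
LABELS = ["human", "human", "llm", "llm"]

print(top_terms(DOCS, LABELS, 2))
```

On this toy corpus, a term that appears in every article (such as "the") carries no authorship information and scores zero, while terms confined to one class rank highest. A short list selected this way could then serve as the fixed vocabulary of a TF-IDF vectorizer feeding a logistic regression classifier, for example through the `vocabulary` argument of scikit-learn's `TfidfVectorizer`; whether this matches the authors' exact pipeline is an assumption.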
