Is Less Really More? Fake News Detection with Limited Information

Zhaoyang Cao; John Nguyen; Reza Zafarani

TL;DR

This work addresses the challenge of fake news detection under information scarcity by introducing the SLIM framework, which systematically selects limited information signals (keywords, sequences, metadata) and quantifies their information content with information-theoretic measures. By leveraging XLNet-base and four input variations (keyword, sequence, metadata, multimodal), SLIM demonstrates that selective cues can match or closely approach full-text performance while markedly reducing data and compute requirements. Key findings show that 30% keyword usage yields near-parity with full-text accuracy on two benchmarks, and multimodal fusion further boosts performance, whereas metadata alone is insufficient. The approach offers a practical, efficiency-driven path for robust fake news detection in sparse-data environments and real-time applications, with broad implications for scalable, multimodal information analysis.

Abstract

The threat that online fake news and misinformation pose to democracy, justice, public confidence, and especially to vulnerable populations has led to a sharp increase in the need for fake news detection and intervention. Whether multi-modal or purely text-based, most fake news detection methods depend on textual analysis of entire articles. However, these methods come with certain limitations. For instance, methods that rely on full text can be computationally inefficient, demand large amounts of training data to achieve competitive accuracy, and may lack robustness across different datasets. This is because fake news datasets vary strongly in the level and types of information they provide: some include large paragraphs of text with images and metadata, while others contain only a few short sentences. Perhaps if one could use only minimal information to detect fake news, fake news detection methods could become more robust and resilient to the lack of information. We aim to overcome these limitations by detecting fake news using systematically selected, limited information that is both effective and capable of delivering robust, promising performance. We propose a framework called SLIM (Systematically-selected Limited Information) for fake news detection. In SLIM, we quantify the amount of information by introducing information-theoretic measures. SLIM leverages limited information to achieve fake news detection performance comparable to that of state-of-the-art methods that use the full text. Furthermore, by combining various types of limited information, SLIM can perform even better while significantly reducing the quantity of information required for training compared to state-of-the-art language model-based fake news detection techniques.
