An exploration of features to improve the generalisability of fake news detection models
Nathaniel Hoy, Theodora Koulouri
TL;DR
This study interrogates why fake news detectors fail to generalise beyond coarse, publisher-based labels. It demonstrates that token-based representations and LLMs underperform on real-world, manually labelled data, while stylistic features and a novel set of social-monetisation signals offer more robust generalisation across datasets. Through two experiments that train on NELA 2020-21 and evaluate on the manually labelled Facebook URLs dataset, the authors show that stylistic and monetisation features not only achieve competitive accuracy but also balance precision and recall, reducing bias from topical signals. A reduced feature set maintains cross-dataset performance with improved efficiency, underscoring the value of feature engineering beyond text alone for practical fake news detection. These findings highlight a path toward robust, deployable systems that resist dataset biases and perform better in real-world settings.
Abstract
Fake news poses global risks by influencing elections and spreading misinformation, making detection critical. Existing NLP and supervised Machine Learning methods perform well under cross-validation but struggle to generalise across datasets, even within the same domain. This issue stems from coarsely labelled training data, where articles are labelled based on their publisher, introducing biases that token-based models like TF-IDF and BERT are sensitive to. While Large Language Models (LLMs) offer promise, their application in fake news detection remains limited. This study demonstrates that meaningful features can still be extracted from coarsely labelled data to improve real-world robustness. Stylistic features (lexical, syntactic, and semantic) are explored because they are less sensitive to dataset biases. Additionally, novel social-monetisation features are introduced, capturing economic incentives behind fake news, such as advertisements, external links, and social media elements. The study trains on the coarsely labelled NELA 2020-21 dataset and evaluates on the manually labelled Facebook URLs dataset, a gold standard for generalisability. Results highlight the limitations of token-based models trained on biased data and contribute to the scarce evidence on LLMs like LLaMa in this field. Findings indicate that stylistic and social-monetisation features offer more generalisable predictions than token-based methods and LLMs. Statistical and permutation feature importance analyses further reveal their potential to enhance performance and mitigate dataset biases, providing a path forward for improving fake news detection.
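To make the two feature families concrete, the following is a minimal sketch of how stylistic and social-monetisation signals might be computed from an article's HTML. It is an illustration only: the marker lists (`AD_HINTS`, `SOCIAL_HOSTS`), the helper names, and the specific features are assumptions chosen for brevity, not the feature set used in the study.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical marker lists, chosen for illustration only.
AD_HINTS = ("adsbygoogle", "doubleclick", "ad-slot")
SOCIAL_HOSTS = ("facebook.com", "twitter.com", "instagram.com")


class PageSignals(HTMLParser):
    """Collects visible text and counts ad, external-link and social elements."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.text_parts = []
        self.counts = {"ads": 0, "external_links": 0, "social_links": 0}
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip += 1
        # Any attribute value containing an ad marker counts as one ad element.
        blob = " ".join(v for v in attrs.values() if v)
        if any(hint in blob for hint in AD_HINTS):
            self.counts["ads"] += 1
        if tag == "a" and attrs.get("href"):
            host = urlparse(attrs["href"]).netloc.lower()
            # Relative links have no host and are treated as internal.
            if host and not host.endswith(self.own_host):
                if any(host.endswith(s) for s in SOCIAL_HOSTS):
                    self.counts["social_links"] += 1
                else:
                    self.counts["external_links"] += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.text_parts.append(data)


def stylistic_features(text):
    """A few simple lexical and syntactic proxies computed from plain text."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    return {
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "type_token_ratio": len({w.lower() for w in words}) / n_words,
        "exclamation_rate": text.count("!") / n_words,
        "all_caps_rate": sum(w.isupper() and len(w) > 1 for w in words) / n_words,
    }


def extract_features(html, own_host):
    """Merge stylistic and social-monetisation features into one dict."""
    parser = PageSignals(own_host)
    parser.feed(html)
    text = " ".join(parser.text_parts)
    return {**stylistic_features(text), **parser.counts}
```

A feature dictionary of this kind could be fed to any tabular classifier trained on the coarsely labelled data and then scored on the manually labelled data. Unlike token-based representations, none of these values depends on the article's topic vocabulary, which is the intuition behind their reduced sensitivity to publisher-level bias.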
