Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset
Hsuvas Borkakoty, Luis Espinosa-Anke
TL;DR
Hoaxpedia presents a text-centric benchmark for detecting Wikipedia hoaxes by unifying 311 confirmed hoax articles with about 30,000 semantically similar legitimate articles. The study systematically compares surface-level features and shows that, while these cues are similar across hoax and legitimate articles, revision-history signals offer stronger discrimination. A broad set of experiments across BERT-family models, Longformer, T5, and large language models reveals that full-text content generally yields higher performance than the first sentence alone, with Longformer achieving around 0.8 F1 in full-text settings and RoBERTa-based models performing consistently in definition-only setups. The work provides a practical dataset and benchmarks to inform text-based disinformation detection on Wikipedia, and suggests future work on editor-based signals and on balanced training to handle data imbalance.
Abstract
Hoaxes are a recognised form of deliberately created disinformation, with potentially serious implications for the credibility of reference knowledge resources such as Wikipedia. What makes Wikipedia hoaxes hard to detect is that they are often written according to the official style guidelines. In this work, we first provide a systematic analysis of similarities and discrepancies between legitimate and hoax Wikipedia articles, and introduce Hoaxpedia, a collection of 311 hoax articles (from existing literature and official Wikipedia lists) together with semantically similar legitimate articles, which together form a binary text classification dataset aimed at fostering research in automated hoax detection. We report results from experiments with several language models, hoax-to-legit ratios, and amounts of text that classifiers are exposed to (the full article vs. the article's definition alone). Our results suggest that detecting deceitful content in Wikipedia based on content alone is hard but feasible. We complement this analysis with a study of the differences in the distributions of edit histories, and find that this feature yields better classification results than content alone.
