WiDe-analysis: Enabling One-click Content Moderation Analysis on Wikipedia's Articles for Deletion
Hsuvas Borkakoty, Luis Espinosa-Anke
TL;DR
WiDe-analysis introduces a unified Python toolkit for one-click analysis of Wikipedia AfD discussions, combining data gathering, preprocessing, and NLP analyses for outcome, stance, and policy prediction, as well as sentiment and offensive-language detection. The work presents a substantial experimental evaluation showing RoBERTa-Large as the most robust performer across tasks, with LLMs underperforming relative to fine-tuned transformers in outcome prediction. A HuggingFace Space provides an accessible demo, while WiDe-powered insights demonstrate interpretable connections between textual signals and moderation outcomes. By releasing data, models, and tooling, the work aims to accelerate research on content moderation in Wikipedia and beyond, with broad applicability to runtime analysis of online discussions and policy-grounded explanations.
Abstract
Content moderation in online platforms is crucial for ensuring that activity on them adheres to existing policies, especially as these platforms grow. NLP research in this area has typically focused on automating some part of the process, given that it is not feasible to monitor all active discussions effectively. Past works have focused on revealing deletion patterns with techniques such as sentiment analysis, or on developing platform-specific models such as Wikipedia policy or stance detectors. Unsurprisingly, however, this valuable body of work is rather scattered, with little to no agreement with regard to, e.g., the deletion discussion corpora used for training or the number of stance labels. Moreover, while efforts have been made to connect stance with rationales (e.g., to ground a deletion decision on the relevant policy), there is little explainability work beyond that. In this paper, we introduce a suite of experiments on Wikipedia deletion discussions and wide-analysis (Wikipedia Deletion Analysis), a Python package aimed at providing one-click analysis of content moderation discussions. We release all assets associated with wide-analysis, including data, models and the Python package, and a HuggingFace space, with the goal of accelerating research on automating content moderation in Wikipedia and beyond.
