Learning for Detecting Norm Violation in Online Communities
Thiago Freitas dos Santos, Nardine Osman, Marco Schorlemmer
TL;DR
This paper tackles the challenge of interpreting and enforcing online community norms by proposing a normative violation detection framework (FNVD) that not only detects violations but also explains to violators which features of their actions triggered the violation. FNVD combines Logistic Model Trees for interpretable probabilistic classification with K-Means clustering to identify the most influential action features, guided by a domain-specific taxonomy for explanation. The authors apply FNVD to Wikipedia vandalism detection, using a feature-rich taxonomy over 61 features derived from edit data and evaluated with 10-fold cross-validation, achieving an overall accuracy of $96\%$ and providing actionable feedback to violators. The work demonstrates a path toward adaptive, explainable norm enforcement in online communities, with online retraining and feedback mechanisms proposed for handling evolving norms and broader domains.
Abstract
In this paper, we focus on normative systems for online communities. The paper addresses the issue that arises when different community members interpret these norms in different ways, possibly leading to unexpected behavior in interactions, usually with norm violations that affect the individual and community experiences. To address this issue, we propose a framework capable of detecting norm violations and providing the violator with information about the features of their action that makes this action violate a norm. We build our framework using Machine Learning, with Logistic Model Trees as the classification algorithm. Since norm violations can be highly contextual, we train our model using data from the Wikipedia online community, namely data on Wikipedia edits. Our work is then evaluated with the Wikipedia use case where we focus on the norm that prohibits vandalism in Wikipedia edits.
