Effects of algorithmic flagging on fairness: quasi-experimental evidence from Wikipedia

Nathan TeBlunthuis; Benjamin Mako Hill; Aaron Halfaker

TL;DR

The paper investigates whether algorithmic flagging can alleviate overprofiling and improve fairness in online community moderation, focusing on Wikipedia with the RCFilters system (backed by ORES). It uses a quasi-experimental regression discontinuity design around ORES score thresholds to causally estimate how flags influence moderator sanctions and controversial outcomes across 23 language editions. The findings show that flagging can increase sanctioning and, for some social signals, reduce unfair scrutiny (overprofiling) and controversial sanctions, though effects are heterogeneous and sensitive to context and design choices. The study contributes a methodological template for evaluating algorithmic decision-support tools in real-world sociotechnical systems and offers practical design guidance for moderation interfaces and fairness considerations.
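
To make the regression discontinuity idea concrete, here is a minimal sketch on simulated data, not the authors' code or data. It assumes a single hypothetical flagging threshold of 0.5 on a score in [0, 1], a made-up true jump of 0.10 in revert probability at that threshold, and an arbitrary bandwidth of 0.1. The function name `rd_estimate` and all variable names are illustrative. The estimator fits a separate line on each side of the threshold and reports the gap between them at the cutoff.

```python
import numpy as np


def rd_estimate(score, outcome, threshold, bandwidth):
    """Local linear regression discontinuity estimate of the jump at `threshold`."""
    score = np.asarray(score, dtype=float)
    outcome = np.asarray(outcome, dtype=float)

    # Keep only observations close to the threshold.
    keep = np.abs(score - threshold) <= bandwidth
    x = score[keep] - threshold  # centered running variable
    y = outcome[keep]
    flagged = (x >= 0).astype(float)

    # Columns: intercept, flag indicator, slope below, change in slope above.
    design = np.column_stack([np.ones_like(x), flagged, x, x * flagged])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on the flag indicator = estimated jump


# Simulated edits (all numbers below are assumptions for illustration only).
rng = np.random.default_rng(0)
n = 200_000
score = rng.uniform(0.0, 1.0, n)   # stand-in for a model score
threshold = 0.5                    # hypothetical flagging threshold
true_jump = 0.10                   # hypothetical effect of the flag

# Revert probability rises smoothly with the score, plus a jump when flagged.
p_revert = 0.05 + 0.30 * score + true_jump * (score >= threshold)
reverted = rng.random(n) < p_revert

estimate = rd_estimate(score, reverted, threshold, bandwidth=0.1)
print(f"estimated jump at threshold: {estimate:.3f}")
```

Edits just below and just above the threshold should be similar in everything except whether they are flagged, so the fitted gap at the cutoff can be read as the effect of the flag itself. A real analysis would also need principled bandwidth selection, standard errors, and checks for score manipulation around the cutoff, none of which this sketch attempts.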

Abstract

Online community moderators often rely on social signals such as whether or not a user has an account or a profile page as clues that users may cause problems. Reliance on these clues can lead to "overprofiling" bias when moderators focus on these signals but overlook the misbehavior of others. We propose that algorithmic flagging systems deployed to improve the efficiency of moderation work can also make moderation actions more fair to these users by reducing reliance on social signals and making norm violations by everyone else more visible. We analyze moderator behavior in Wikipedia as mediated by RCFilters, a system which displays social signals and algorithmic flags, and estimate the causal effect of being flagged on moderator actions. We show that algorithmically flagged edits are reverted more often, especially those by established editors with positive social signals, and that flagging decreases the likelihood that moderation actions will be undone. Our results suggest that algorithmic flagging systems can lead to increased fairness in some contexts but that the relationship is complex and contingent.
