Toxic comments reduce the activity of volunteer editors on Wikipedia

Ivan Smirnov, Camelia Oprea, Markus Strohmaier

TL;DR

The paper investigates how toxic comments on Wikipedia user-talk pages affect editor behavior across six language editions. Using 57 million comments and Perspective API toxicity scores, the authors show that toxic feedback reduces short-term editor activity by approximately 0.5–2 active days per user and increases the likelihood of editors leaving, with effects amplified by higher toxicity. The probability of leaving after $N$ contributions follows a power law, $P_N(\text{leaving}) \sim N^{-\alpha}$ (with $\alpha$ in the range $0.89$ to $1.02$), and toxicity raises this attrition, with a substantial early-leaving risk ($P_1$ around 0.47 for English). An agent-based model then shows that sustained toxicity can hinder project progress, nearly eliminating long-term contributors unless new editors continually join, underscoring the need for toxicity mitigation to preserve Wikipedia’s volunteer-driven model.
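The power-law relation in the TL;DR can be illustrated with a minimal sketch. It assumes the proportionality constant equals $P_1$ (so that $P_N = P_1 N^{-\alpha}$), uses an illustrative $\alpha = 0.95$ from within the reported range, and fits synthetic values rather than the paper's data. The function names `leaving_probability` and `fit_power_law` are hypothetical. The fit is an ordinary least-squares line on a log-log scale, where the slope is $-\alpha$ and the intercept is $\log P_1$.

```python
import math

def leaving_probability(n, p1, alpha):
    """Assumed form P_N = p1 * N**(-alpha); p1 as the prefactor is an assumption."""
    return p1 * n ** (-alpha)

def fit_power_law(ns, ps):
    """Least-squares line through (log N, log P_N).

    Returns (alpha, p1), where the slope of the line is -alpha
    and exp(intercept) is the estimated p1.
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(p) for p in ps]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)

# Synthetic, noise-free example values (not taken from the paper's data).
n_values = list(range(1, 101))
p_values = [leaving_probability(n, p1=0.47, alpha=0.95) for n in n_values]

alpha_hat, p1_hat = fit_power_law(n_values, p_values)
print(f"alpha ~ {alpha_hat:.3f}, P_1 ~ {p1_hat:.3f}")
```

On real data the estimates would be noisy, and comparing the fitted lines for editors who did and did not receive a toxic comment would show the vertical offset between the two curves.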

Abstract

Wikipedia is one of the most successful collaborative projects in history. It is the largest encyclopedia ever created, with millions of users worldwide relying on it as the first source of information as well as for fact-checking and in-depth research. As Wikipedia relies solely on the efforts of its volunteer-editors, its success might be particularly affected by toxic speech. In this paper, we analyze all 57 million comments made on user talk pages of 8.5 million editors across the six most active language editions of Wikipedia to study the potential impact of toxicity on editors' behaviour. We find that toxic comments consistently reduce the activity of editors, leading to an estimated loss of 0.5-2 active days per user in the short term. This amounts to multiple human-years of lost productivity when considering the number of active contributors to Wikipedia. The effects of toxic comments are even greater in the long term, as they significantly increase the risk of editors leaving the project altogether. Using an agent-based model, we demonstrate that toxicity attacks on Wikipedia have the potential to impede the progress of the entire project. Our results underscore the importance of mitigating toxic speech on collaborative platforms such as Wikipedia to ensure their continued success.

Paper Structure (15 sections, 1 equation, 8 figures, 5 tables)

Figures (8)

  • Figure 1: After receiving a toxic comment, many users temporarily reduce their activity or leave the project completely. The figure shows the activity of 50 randomly selected users who received exactly one toxic comment. Blue squares indicate an active day, i.e. a day on which at least one edit was made, starting from the first contribution of a given user. Red triangles correspond to toxic comments. Note that while some users are resilient and their activity is seemingly unaffected by toxic comments, many users temporarily reduce their activity or stop contributing altogether.
  • Figure 2: After receiving a toxic comment, editors become less active. On average, users are more active near the time when they receive a toxic comment (peak at zero for the red line in panel a). Average activity across all users who have received a toxic comment is lower in all $100$ days after the event compared to the corresponding days before (dashed and solid red lines in panel b). This cannot be explained by a baseline drop in activity after a non-toxic comment (dashed and solid blue lines in panel b). Similar results hold not only for the English edition but also for the other five editions (c-g).
  • Figure 3: The probability of leaving Wikipedia after receiving a toxic comment is substantially higher than might be expected otherwise. For all six editions, the probability of leaving declines with the number of contributions, approximately following a power law. At the same time, this probability is substantially higher after receiving a toxic comment than might be expected otherwise. Dots are probability estimates and solid lines are the best linear fits on a log-log scale.
  • Figure 4: Comparison of the number of users over time in a non-toxic vs. a toxic environment. In a non-toxic environment, the initial population of $1000$ users (a) stabilizes at around $200$, whereas in a toxic environment, the population nearly dies out. If users arrive constantly at a rate that sustains the population in a toxic environment (b), the same influx leads to growth in a non-toxic environment. When users arrive only up to a certain point in time (c), the peak activity of the project is substantially lower in a toxic environment than in a non-toxic environment, and activity eventually dies out.
  • Figure S1: Lost activity estimates as a function of toxicity threshold and activity level of users. We find that our results are robust with respect to the toxicity threshold and to filtering out less active users. For visual clarity, the 95% confidence intervals (shaded regions) are shown only for the $0.2$ and $0.9$ thresholds.
  • ...and 3 more figures