The Landscape of Toxicity: An Empirical Investigation of Toxicity on GitHub
Jaydeb Sarker, Asif Kamal Turzo, Amiangshu Bosu
TL;DR
This study quantitatively maps toxicity in GitHub OSS by analyzing 2,828 projects with 16 million PRs and 101.5 million comments using ToxiCR, whose output is validated through manual annotation. It applies multinomial logistic regression and bootstrapped logistic regression across 32 project, PR, and participant attributes to identify factors associated with toxicity. Key findings: profanity is the dominant form of toxicity; popular projects and gaming projects show more toxicity; issue resolution rate and corporate sponsorship are negatively associated with toxicity; and past authors of toxic comments tend to repeat the behavior, although predictive power for individual participants is limited. The work provides context-aware mitigation guidance and publicly releases the dataset and scripts to support replication.
Abstract
Toxicity on GitHub can severely impact Open Source Software (OSS) development communities. Mitigating such behavior requires a better understanding of its nature and of how measurable characteristics of project contexts and participants are associated with its prevalence. To achieve this goal, we conducted a large-scale mixed-method empirical study of 2,828 GitHub-based OSS projects randomly selected using a stratified sampling strategy. Using ToxiCR, an SE domain-specific toxicity detector, we automatically classified each comment as toxic or non-toxic. Additionally, we manually analyzed a random sample of 600 comments to validate ToxiCR's performance and gain insights into the nature of toxicity within our dataset. The results of our study suggest that profanity is the most frequent form of toxicity on GitHub, followed by trolling and insults. While a project's popularity is positively associated with the prevalence of toxicity, its issue resolution rate has the opposite association. Corporate-sponsored projects are less toxic, but gaming projects are seven times more toxic than non-gaming ones. OSS contributors who have authored toxic comments in the past are significantly more likely to repeat such behavior. Moreover, such individuals are more likely to become targets of toxic texts.
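
To make a statement such as "seven times more toxic" concrete, the following minimal sketch compares the toxic-comment rates of two groups of projects and attaches a percentile bootstrap interval to the ratio. It is a simplified stand-in, not the bootstrapped logistic regression used in the study: the function names are hypothetical, and the labels are synthetic values chosen for illustration, not data from the paper.

```python
import random


def toxicity_rate(labels):
    """Fraction of comments labeled toxic (1 = toxic, 0 = non-toxic)."""
    return sum(labels) / len(labels)


def bootstrap_rate_ratio(group_a, group_b, n_resamples=1000, seed=0):
    """Point estimate and 95% percentile bootstrap interval for
    toxicity_rate(group_a) / toxicity_rate(group_b)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        sample_a = rng.choices(group_a, k=len(group_a))
        sample_b = rng.choices(group_b, k=len(group_b))
        rate_b = toxicity_rate(sample_b)
        if rate_b == 0:
            continue  # ratio undefined for this resample; skip it
        ratios.append(toxicity_rate(sample_a) / rate_b)
    ratios.sort()
    lower = ratios[int(0.025 * len(ratios))]
    upper = ratios[int(0.975 * len(ratios)) - 1]
    point = toxicity_rate(group_a) / toxicity_rate(group_b)
    return point, lower, upper


# Synthetic labels (assumption, NOT taken from the study's dataset):
# 7% toxic comments in one group, 1% in the other.
gaming = [1] * 70 + [0] * 930
non_gaming = [1] * 10 + [0] * 990

point, lower, upper = bootstrap_rate_ratio(gaming, non_gaming)
print(f"rate ratio = {point:.2f}, 95% bootstrap interval = [{lower:.2f}, {upper:.2f}]")
```

A wide interval here would signal that the ratio rests on few toxic comments in the comparison group, which is one reason resampling is useful when toxic comments are rare.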
