The Peripatetic Hater: Predicting Movement Among Hate Subreddits

Daniel Hickey; Daniel M. T. Fessler; Kristina Lerman; Keith Burghardt

TL;DR

The authors show that when users become active in their first hate subreddit, they have a high likelihood of becoming active in additional hate subreddits of a different category, and that users who join additional hate subreddits, especially those of a different category, develop a wider hate group lexicon.

Abstract

Many online hate groups exist to disparage others based on race, gender identity, sex, or other characteristics. The accessibility of these communities allows users to join multiple types of hate groups (e.g., a racist community and a misogynistic community), raising the question of whether users who join additional types of hate communities could be further radicalized compared to users who stay in one type of hate group. However, little is known about the dynamics of joining multiple types of hate groups or about the effect of these groups on peripatetic users. We develop a new method to classify hate subreddits and the identities they disparage, then apply it to better understand how users come to join different types of hate subreddits. The hate classification technique utilizes human-validated deep learning models to extract the protected identities attacked, if any, across 168 subreddits. We find distinct clusters of subreddits targeting various identities, such as racist subreddits, xenophobic subreddits, and transphobic subreddits. We show that when users become active in their first hate subreddit, they have a high likelihood of becoming active in additional hate subreddits of a different category. We also find that users who join additional hate subreddits, especially those of a different category, develop a wider hate group lexicon. These results then lead us to train a deep learning model that, as we demonstrate, usefully predicts the hate categories in which users will become active based on post text replied to and written. The accuracy of this model may be partly driven by peripatetic users often using the language of hate subreddits they eventually join. Overall, these results highlight the unique risks associated with hate communities on a social media platform, as discussion of alternative targets of hate may lead users to target more protected identities.

Paper Structure (17 sections, 13 figures, 5 tables)

Introduction
Related Work
Types of Online Hate Communities
Radicalization Pathways and Gateway Communities
Movement Among Online Communities
Methods
Collecting a Dataset of Hate Subreddits
User Matching
Peripatetic users and ingroup language
Extracting Topics of Discussion Used by Peripatetic Users
Predicting Participation In Hate Subreddit Types
Results
Discussion
Limitations and Future Directions
Acknowledgements
...and 2 more sections

Figures (13)

Figure 1: UMAP plot of hate speech distributions for each subreddit. Each point represents a subreddit, and its color corresponds to the K-means cluster to which it has been assigned.
Figure 2: Z-scores of the identities attacked in each cluster. Clusters are titled by the category of hate with the highest z-score (most over-represented).
Figure 3: Proportion of users who post in alternative subreddit categories within six weeks after becoming active in their initial hate subreddit category. Error bars represent standard errors.
Figure 4: Heatmaps representing peripatetic user behavior. (A) Rate of users from the original subreddit type that subsequently posted in the alternate subreddit. (B) Odds ratios of peripatetic vs. non-peripatetic users using language from the alternate hate subreddit lexicon within the origin subreddit type. (C) Change in use of language from alternate hate subreddit lexicons after posting in alternate hate subreddits. In (B) and (C), cells containing numbers are statistically significant ($p < 0.05$).
Figure 5: Most frequent topics by category of subreddit. The number of posts within each topic is displayed on the x-axis of each figure. Stopwords were removed from topic representations.
...and 8 more figures
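
The odds ratios named in the caption of Figure 4(B) compare how often peripatetic and non-peripatetic users use language from an alternate hate subreddit lexicon. As a minimal sketch of how such an odds ratio can be computed from a 2x2 table of user counts: the function names, the example counts, and the Wald-style confidence interval below are illustrative assumptions, not the authors' code, data, or stated significance test.

```python
import math


def odds_ratio(peri_using, peri_not, nonperi_using, nonperi_not):
    """Odds ratio from a 2x2 table of user counts (hypothetical helper).

    peri_using / peri_not: peripatetic users who do / do not use the lexicon.
    nonperi_using / nonperi_not: the same counts for non-peripatetic users.
    """
    counts = (peri_using, peri_not, nonperi_using, nonperi_not)
    if min(counts) <= 0:
        raise ValueError("all four cells must be positive")
    return (peri_using / peri_not) / (nonperi_using / nonperi_not)


def wald_interval(peri_using, peri_not, nonperi_using, nonperi_not, z=1.96):
    """Approximate confidence interval on the odds ratio (an assumption here).

    Uses the standard error of the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d).
    """
    log_or = math.log(
        odds_ratio(peri_using, peri_not, nonperi_using, nonperi_not)
    )
    se = math.sqrt(
        1 / peri_using + 1 / peri_not + 1 / nonperi_using + 1 / nonperi_not
    )
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Made-up counts for illustration only; these are not values from the paper.
example_or = odds_ratio(30, 70, 10, 90)
example_low, example_high = wald_interval(30, 70, 10, 90)
```

An odds ratio above 1 would indicate that peripatetic users are more likely than non-peripatetic users to use the alternate lexicon; an interval that excludes 1 is one common way to flag a cell as significant.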
