Analyzing User Characteristics of Hate Speech Spreaders on Social Media

Dominique Geissler; Abdurahman Maarouf; Stefan Feuerriegel

TL;DR

This work investigates who reshares hate speech on social media and how the effects of user attributes differ across types of hate speech. It introduces a three-step framework that (1) clusters hate speech posts into types, (2) debiases the observational data via inverse propensity scoring while controlling for users' past latent vulnerability to hate speech, and (3) estimates non-linear effects of user attributes on resharing with an explainable boosting model. Users with lower social influence and older accounts tend to reshare more hate speech, and the patterns differ across racist/misogynistic, political, and toxic hate. The approach gives platforms a flexible, explainable way to identify at-risk users and tailor mitigation strategies against the spread of hate speech.

Abstract

Hate speech on social media threatens the mental and physical well-being of individuals and contributes to real-world violence. Resharing is an important driver behind the spread of hate speech on social media. Yet, little is known about who reshares hate speech and what their characteristics are. In this paper, we analyze the role of user characteristics in hate speech resharing across different types of hate speech (e.g., political hate). For this, we proceed as follows: First, we cluster hate speech posts using large language models to identify different types of hate speech. Then we model the effects of user attributes on users' probability to reshare hate speech using an explainable machine learning model. To do so, we apply debiasing to control for selection bias in our observational social media data and further control for the latent vulnerability of users to hate speech. We find that, all else equal, users with fewer followers, fewer friends, fewer posts, and older accounts share more hate speech. This shows that users with little social influence tend to share more hate speech. Further, we find substantial heterogeneity across different types of hate speech. For example, racist and misogynistic hate is spread mostly by users with little social influence. In contrast, political anti-Trump and anti-right-wing hate is reshared by users with larger social influence. Overall, understanding the factors that drive users to share hate speech is crucial for detecting individuals at risk of engaging in harmful behavior and for designing effective mitigation strategies.
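The debiasing idea mentioned in the abstract can be illustrated with a minimal sketch of inverse propensity scoring (IPS). It is not the authors' implementation: the propensity values, the clipping threshold, and the function names `ips_weights` and `weighted_reshare_rate` are assumptions made for illustration, and the data are invented. Each observation is weighted by the inverse of its probability of being observed, so that under-represented cases count more.

```python
def ips_weights(propensities, clip=0.01):
    """Return inverse propensity weights 1/p.

    Propensities are clipped from below (assumed threshold) so that
    very small probabilities do not produce exploding weights.
    """
    return [1.0 / max(p, clip) for p in propensities]


def weighted_reshare_rate(reshared, propensities, clip=0.01):
    """Self-normalized IPS estimate of the reshare rate."""
    weights = ips_weights(propensities, clip)
    weighted_sum = sum(w * y for w, y in zip(weights, reshared))
    return weighted_sum / sum(weights)


# Toy data (invented): 1 = user reshared the hateful post, 0 = did not.
reshared = [1, 0, 1, 0]
# Hypothetical probabilities that each user was observed (e.g., exposed).
propensities = [0.5, 0.25, 0.5, 0.25]

naive_rate = sum(reshared) / len(reshared)
debiased_rate = weighted_reshare_rate(reshared, propensities)
print(f"naive={naive_rate:.3f}, IPS-weighted={debiased_rate:.3f}")
```

In this toy example the non-resharing users were less likely to be observed, so the weighted estimate falls below the naive one. In a full pipeline the same weights would typically be passed as sample weights to the downstream model rather than used for a simple average.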

Paper Structure (30 sections, 10 equations, 5 figures, 9 tables)

Introduction
Related Work
Hate speech on social media
Inverse propensity scoring
Modeling past latent vulnerability
Methodology
Overview
Clustering hate speech (Step 1)
Debiasing (Step 2)
IPS reweighting
Modeling past latent vulnerability
Modeling hate speech resharing (Step 3)
Likelihood to reshare hate speech
Effect estimation
Experimental Setup
...and 15 more sections

Figures (5)

Figure 1: Feature importance of the user attributes when predicting the users' probability to reshare hate speech.
Figure 2: Contributions of the user attributes to the users' probability to reshare hate speech as estimated by our debiasing framework (both y-axis and outcome $Y$ are shown on the same scale). Each graph is normalized for better comparability such that the average prediction on the train set is at 0. The blue curve shows the predicted effect, while the orange curves show the upper and lower bounds of our model.
Figure 3: Contributions of the user attributes to the users' probability to reshare racist and misogynistic rants as estimated by our debiasing framework (both y-axis and outcome $Y$ are shown on the same scale). Each graph is normalized for better comparability such that the average prediction on the train set is at 0. The blue curve shows the predicted effect, while the orange curves show the upper and lower bounds of our model.
Figure 4: Contributions of the user attributes to the users' probability to reshare anti-Trump and anti-right-wing rants as estimated by our debiasing framework (both y-axis and outcome $Y$ are shown on the same scale). Each graph is normalized for better comparability such that the average prediction on the train set is at 0. The blue curve shows the predicted effect, while the orange curves show the upper and lower bounds of our model.
Figure 5: Contributions of the user attributes to the users' probability to reshare toxic rants as estimated by our debiasing framework (both y-axis and outcome $Y$ are shown on the same scale). Each graph is normalized for better comparability such that the average prediction on the train set is at 0. The blue curve shows the predicted effect, while the orange curves show the upper and lower bounds of our model.
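The normalization described in the captions of Figures 2 to 5 can be sketched as follows. This is an illustrative assumption about how such centering is commonly done for additive models, not the authors' code: the helper name `center_contributions` and the numbers are hypothetical. The mean contribution of an attribute over the training set is subtracted, so the plotted curve averages to 0 and the removed offset can be absorbed into the model intercept.

```python
def center_contributions(contributions):
    """Shift per-sample contributions of one attribute to mean zero.

    Returns the centered values and the offset that was removed.
    """
    offset = sum(contributions) / len(contributions)
    centered = [c - offset for c in contributions]
    return centered, offset


# Invented contributions of one attribute for three training users.
raw = [0.2, 0.4, 0.6]
centered, offset = center_contributions(raw)
print(centered, offset)
```

After centering, a positive value means the attribute pushes the predicted probability to reshare above the training average, and a negative value means it pushes the prediction below it.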
