Modelling the Spread of Toxicity and Exploring its Mitigation on Online Social Networks

Aatman Vaidya; Harsh Bhagat; Seema Nagar; Amit A. Nanavati

TL;DR

The paper reframes the spread of toxicity on online social networks as a transformation process: each user acts as a transformer that applies a shift to incoming toxicity before forwarding it. It introduces a three-category taxonomy (Amplifiers, Attenuators, Copycats) and a shift-based propagation model, backed by a temporal analysis of Twitter, Gab, and Koo showing that toxicity is not conserved in the network and that there is no evidence of homophily among behaviour-changing users. As an intervention, the authors propose and evaluate peace-bots, accounts that emit zero toxicity. Mitigation effectiveness depends on network topology and bot placement, and no universal deployment strategy emerges. The findings point to the need for network-aware moderation and offer a soft-intervention framework that reduces exposure to toxic content without removing users or links. The model is formalized as $O(u,t) = I_{avg}(u,t) + s(c(u,t), I_{avg}(u,t))$, where $O(u,t)$ is the toxicity user $u$ sends forward at time $t$, $I_{avg}(u,t)$ is the average toxicity $u$ receives, $c(u,t)$ is the user's category at that time, and the shift $s$ is sampled per category.

Abstract

Hate speech on online platforms has been credibly linked to multiple instances of real-world violence. This creates an urgent need to understand how toxic content spreads on online social networks and how it might be mitigated, a question that has, as expected, attracted extensive research in recent times. Prior work has largely modelled hate through epidemic or spreading-activation-based diffusion models, in which users are often divided into two categories, hateful or not. In this work, users are treated as transformers of toxicity, based on how they respond to incoming toxicity. Compared with the incoming toxicity, users amplify, attenuate, or replicate (effectively, transform) the toxicity and send it forward. We perform a temporal analysis of toxicity on Twitter, Koo and Gab and find that (a) toxicity is not conserved in the network; (b) only a subset of users change behaviour over time; and (c) there is no evidence of homophily among behaviour-changing users. In our model, each user transforms incoming toxicity by applying a "shift" to it before sending it forward. Based on this, we develop a network model of toxicity spread that incorporates the time-varying behaviour of users. We find that the "shift" applied by a user depends on the input toxicity and the user's category. Based on this finding, we propose an intervention strategy for toxicity reduction, which we simulate by deploying peace-bots. Through experiments on both real-world and synthetic networks, we demonstrate that peace-bot interventions can reduce toxicity, though their effectiveness depends on network structure and placement strategy.
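The shift-based update and the peace-bot intervention described above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the four-user network, the fixed shift magnitudes, and the clamping of toxicity to [0, 1] are all assumptions made for illustration. In particular, the paper samples shifts per category, whereas the sketch uses deterministic shifts so that runs are reproducible.

```python
from statistics import mean


def shift(category, i_avg):
    """Hypothetical shift s(c, I_avg); the 0.2 magnitudes are illustrative only."""
    if category == "amplifier":
        return 0.2 * (1.0 - i_avg)   # push output above the incoming average
    if category == "attenuator":
        return -0.2 * i_avg          # pull output below the incoming average
    return 0.0                       # copycat: forward the input unchanged


def step(in_neighbours, category, output, peace_bots=frozenset()):
    """One synchronous update of O(u,t) = I_avg(u,t) + s(c(u,t), I_avg(u,t))."""
    new_output = {}
    for u, sources in in_neighbours.items():
        if u in peace_bots:
            new_output[u] = 0.0          # peace-bots always emit zero toxicity
        elif not sources:
            new_output[u] = output[u]    # assumption: no input, keep last output
        else:
            i_avg = mean(output[v] for v in sources)
            o = i_avg + shift(category[u], i_avg)
            new_output[u] = min(1.0, max(0.0, o))  # assumption: clamp to [0, 1]
    return new_output


def simulate(in_neighbours, category, initial, steps, peace_bots=frozenset()):
    """Run the update for a fixed number of steps and return final outputs."""
    output = dict(initial)
    for u in peace_bots:
        output[u] = 0.0
    for _ in range(steps):
        output = step(in_neighbours, category, output, peace_bots)
    return output


# Hypothetical toy network: in_neighbours[u] lists the users whose posts u sees.
in_neighbours = {"a": ["d"], "b": ["a"], "c": ["a", "b"], "d": ["c"]}
category = {"a": "amplifier", "b": "copycat", "c": "attenuator", "d": "copycat"}
initial = {u: 0.5 for u in in_neighbours}

baseline = simulate(in_neighbours, category, initial, steps=5)
treated = simulate(in_neighbours, category, initial, steps=5, peace_bots={"a"})

print("mean toxicity without peace-bots:", mean(baseline.values()))
print("mean toxicity with a peace-bot at 'a':", mean(treated.values()))
```

In this toy network every path passes through user `a`, so replacing `a` with a peace-bot drives all outputs to zero within a few steps. That outcome is an artefact of the tiny topology chosen here; it says nothing about real networks, where the abstract reports that effectiveness depends on structure and placement.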
