Predictively Combatting Toxicity in Health-related Online Discussions through Machine Learning

Jorge Paz-Ruza, Amparo Alonso-Betanzos, Bertha Guijarro-Berdiñas, Carlos Eiras-Franco

TL;DR

This work reframes online toxicity mitigation in health contexts as a predictive, dyadic task between users and subcommunities. By combining Detoxify-derived toxicity signals with a Matrix Factorization–based Collaborative Filtering model, it learns latent toxicity characteristics of users and subreddits to forecast future interactions, framing the problem as binary classification. The authors introduce a novel adaptation of the LOLI data-splitting method for binary dyadic tasks and demonstrate that their approach achieves ~0.83 G-Mean on Reddit COVID-related data, outperforming simple baselines. The results suggest that pre-emptively steering users away from potentially toxic subforums could reduce harmful content and moderation costs, with future work aiming to extend the approach to other platforms and incorporate richer textual/temporal cues.
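The G-Mean figure quoted above is easier to interpret with the metric written out. The sketch below assumes the standard definition, the geometric mean of sensitivity (true positive rate) and specificity (true negative rate); the summary does not spell out the formula, and the helper name `g_mean` and the toy labels are illustrative rather than taken from the paper.

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (TPR) and specificity (TNR).

    Assumes binary labels where 1 = toxic interaction and 0 = non-toxic.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("G-Mean needs at least one example of each class")
    return math.sqrt((tp / pos) * (tn / neg))

# Toy labels for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)          # sqrt((2/3) * (4/5))

# A classifier that always predicts the majority class scores zero.
majority_score = g_mean(y_true, [0] * len(y_true))
```

Because the two rates are multiplied, a model that ignores the minority class is penalised heavily, which is one reason the metric suits imbalanced toxicity data.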

Abstract

In health-related topics, user toxicity in online discussions frequently becomes a source of social conflict or promotes dangerous, unscientific behaviour. Common approaches for battling it include different forms of detection, flagging and/or removal of existing toxic comments, which is often counterproductive for platforms and users alike. In this work, we propose the alternative of combatting user toxicity predictively, anticipating where a user could interact toxically in health-related online discussions. Applying a Collaborative Filtering-based Machine Learning methodology, we predict the toxicity in COVID-related conversations between any user and subcommunity of Reddit, surpassing 80% predictive performance in relevant metrics, which allows us to prevent the pairing of conflicting users and subcommunities.

Paper Structure

This paper contains 11 sections, 2 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Acquisition, filtering and pre-processing processes for online Reddit comment data.
  • Figure 3: The Matrix Factorisation (MF) paradigm. The full interaction matrix is modelled as two low-dimensional matrices $U$ and $V$, representing the latent characteristics of users and items, respectively. The resulting value of any interaction can be obtained as the scalar product of a column of $V$ and a row of $U$.
  • Figure 4: Adaptation of the LOLI partitioning strategy (Meng et al., 2020) for binary classification on dyadic data. Note that this process is repeated to split the training set into the final training and validation sets.
  • Figure 5: Topology of the Machine Learning model proposed to predict the toxicity of health-related conversations in unobserved user-subreddit interactions on the Reddit platform. The model receives the user identifier $u$ and the subreddit identifier $s$, and predicts the expected toxicity $tox(C(u,s)) \in [0,1]$ from user behaviour on the subreddit. $d$ is the number of latent features used.
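
The Matrix Factorisation idea in the captions for Figures 3 and 5 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the sigmoid output, the binary cross-entropy update, the learning rate, and the helper names `init_factors`, `predict` and `sgd_step` are all hypothetical choices made here. For convenience, the sketch stores one latent vector per subreddit as a row of `V`, which is the transpose of the layout described in Figure 3.

```python
import math
import random

def init_factors(n, d, rng, scale=0.1):
    """Return n latent vectors of dimension d with small Gaussian entries."""
    return [[rng.gauss(0.0, scale) for _ in range(d)] for _ in range(n)]

def predict(U, V, u, s):
    """Sigmoid of the scalar product of user u's and subreddit s's latent vectors."""
    score = sum(a * b for a, b in zip(U[u], V[s]))
    return 1.0 / (1.0 + math.exp(-score))   # assumed squashing to [0, 1]

def sgd_step(U, V, u, s, label, lr=0.5):
    """One gradient step on binary cross-entropy for a single (u, s, label) triple."""
    err = predict(U, V, u, s) - label        # derivative of the loss w.r.t. the score
    old_u = list(U[u])                       # keep the pre-update user vector
    U[u] = [a - lr * err * b for a, b in zip(U[u], V[s])]
    V[s] = [b - lr * err * a for a, b in zip(old_u, V[s])]

rng = random.Random(0)
d = 8                                        # number of latent features
U = init_factors(5, d, rng)                  # 5 hypothetical users
V = init_factors(3, d, rng)                  # 3 hypothetical subreddits

p_before = predict(U, V, 2, 1)
for _ in range(200):
    sgd_step(U, V, 2, 1, label=1)            # pretend (user 2, subreddit 1) was toxic
p_after = predict(U, V, 2, 1)
```

Repeatedly showing the model a toxic interaction for one user-subreddit pair pushes its predicted toxicity upwards. In a full model the same update would run over all observed pairs, so that the learned vectors generalise to unobserved ones.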