Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media

Liam Hebert, Gaurav Sahu, Yuxuan Guo, Nanda Kishore Sreenivas, Lukasz Golab, Robin Cohen

TL;DR

The paper tackles hate speech detection in online discussions by moving beyond single-comment text to a holistic, multi-modal analysis that includes images and the surrounding discussion graph. It introduces the Multi-Modal Discussion Transformer (mDT), which interleaves modality fusion with graph Transformer layers through bottleneck tokens and employs a hierarchical spatial encoding based on Cantor’s pairing to capture discussion structure. A new benchmark, HatefulDiscussions, comprises 8266 Reddit discussions with 18359 labelled comments across 850 communities, enabling evaluation of complete multi-modal discussion graphs. Empirically, mDT outperforms text-only and prior graph-based baselines, with strong gains in accuracy and F1, and ablations show the critical roles of images and contextual graph information. The work advances robust, context-aware hate speech detection and provides a foundation for future multi-modal, discourse-grounded models with public data and code release under permissive licenses.

Abstract

We present the Multi-Modal Discussion Transformer (mDT), a novel method for detecting hate speech in online social networks such as Reddit discussions. In contrast to traditional comment-only methods, our approach to labelling a comment as hate speech involves a holistic analysis of text and images grounded in the discussion context. This is done by leveraging graph transformers to capture the contextual relationships in the discussion surrounding a comment and grounding the interwoven fusion layers that combine text and image embeddings instead of processing modalities separately. To evaluate our work, we present a new dataset, HatefulDiscussions, comprising complete multi-modal discussions from multiple online communities on Reddit. We compare the performance of our model to baselines that only process individual comments and conduct extensive ablation studies.

Paper Structure (20 sections, 5 equations, 5 figures, 8 tables)

Figures (5)

  • Figure 1: Multi-Modal Discussion Transformer
  • Figure 2: Example Discussion Structure. Each node in the discussion tree represents a comment. The shortest distance between (a, c) and (b, d) is equivalent, demonstrating a lack of expressiveness towards hierarchy.
  • Figure 3: Fine-grained distribution of BERT and mDT misclassification. (Acronyms above as in Table \ref{tab:dist_labels})
  • Figure 4: An image present in the discussion context of example 3 (Table \ref{tab:BERT_vs_graph_examples}), seen only by mDT, contextualizing comments as potentially hateful
  • Figure 5: An image present in the discussion context of example 2 (Table \ref{tab:BERT_vs_graph_examples}), seen only by mDT, contextualizing comments as potentially hateful
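
The TL;DR mentions a hierarchical spatial encoding based on Cantor's pairing, and the Figure 2 caption notes that shortest-path distance alone cannot distinguish some pairs of comments. The following is a minimal sketch of how such an encoding might work, not the authors' implementation. It assumes that the relation between two comments is described by the number of hops from each comment up to their lowest common ancestor, and that this pair of counts is mapped to a single index with the standard Cantor pairing function. The tree, the node names, and the helper `hops_to_common_ancestor` are hypothetical.

```python
from typing import Dict, Optional, Tuple


def cantor_pair(a: int, b: int) -> int:
    """Standard Cantor pairing: a bijection from pairs of naturals to naturals."""
    return (a + b) * (a + b + 1) // 2 + b


def hops_to_common_ancestor(
    parent: Dict[str, Optional[str]], u: str, v: str
) -> Tuple[int, int]:
    """Return (hops from u, hops from v) to their lowest common ancestor.

    `parent` maps each node to its parent; the root maps to None.
    This helper is an illustrative assumption, not part of the paper.
    """
    # Record the depth of every ancestor of u, counted from u itself.
    hops_from_u: Dict[str, int] = {}
    node: Optional[str] = u
    dist = 0
    while node is not None:
        hops_from_u[node] = dist
        node = parent.get(node)
        dist += 1

    # Climb from v until we hit a node that is also an ancestor of u.
    node, dist = v, 0
    while node not in hops_from_u:
        node = parent.get(node)
        if node is None:
            raise ValueError("nodes are not in the same tree")
        dist += 1
    return hops_from_u[node], dist


def hierarchy_code(parent: Dict[str, Optional[str]], u: str, v: str) -> int:
    """Single integer index that preserves both hop counts (assumed scheme)."""
    return cantor_pair(*hops_to_common_ancestor(parent, u, v))


# Hypothetical discussion tree: root post, two replies, one nested reply.
parent = {"root": None, "x": "root", "y": "root", "x1": "x"}

# Both pairs are two edges apart, so shortest-path distance cannot tell them apart.
ancestor_pair = hops_to_common_ancestor(parent, "root", "x1")  # post vs. nested reply
sibling_pair = hops_to_common_ancestor(parent, "x", "y")       # two sibling replies

ancestor_code = hierarchy_code(parent, "root", "x1")
sibling_code = hierarchy_code(parent, "x", "y")
```

Under these assumptions, the ancestor/descendant pair and the sibling pair share a shortest-path distance of 2 but receive different codes, which is the kind of hierarchical distinction the Figure 2 caption says plain distance lacks.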