Multiclass Hate Speech Detection with RoBERTa-OTA: Integrating Transformer Attention and Graph Convolutional Networks

Mahmoud Abusaqer; Jamil Saquer

TL;DR

RoBERTa-OTA adds ontology-guided attention to RoBERTa: textual features are processed alongside structured knowledge representations through enhanced Graph Convolutional Networks. The design targets large-scale content moderation applications that require fine-grained demographic hate speech classification.

Abstract

Multiclass hate speech detection across demographic categories remains computationally challenging due to implicit targeting strategies and linguistic variability in social media content. Existing approaches rely solely on representations learned from training data and do not explicitly incorporate structured ontological frameworks, which can enhance classification by integrating formal domain knowledge. We propose RoBERTa-OTA, which introduces ontology-guided attention mechanisms that process textual features alongside structured knowledge representations through enhanced Graph Convolutional Networks. The architecture combines RoBERTa embeddings with scaled attention layers and graph neural networks to integrate contextual language understanding with domain-specific semantic knowledge. Evaluated with 5-fold cross-validation on 39,747 balanced samples, RoBERTa-OTA outperforms baseline RoBERTa implementations and existing state-of-the-art methods. It achieves 96.04% accuracy compared to 95.02% for standard RoBERTa (a gain of 1.02 percentage points), with larger improvements for challenging categories: gender-based hate speech detection improves by 2.36 percentage points, while other hate speech categories improve by 2.38 percentage points. The enhanced architecture maintains computational efficiency with only 0.33% parameter overhead, providing practical advantages for large-scale content moderation applications requiring fine-grained demographic hate speech classification.
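The dual-stream idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the graph, the dimensions, the number of classes, the random weights, and the names `gcn_layer` and `scaled_attention` are all hypothetical, and a random vector stands in for a pooled RoBERTa embedding. The sketch only shows one plausible way a text vector could attend over ontology nodes that have passed through a graph-convolution step before the two streams are concatenated for classification.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    deg = a_hat.sum(axis=1)                   # node degrees (all >= 1)
    d_inv_sqrt = np.diag(deg ** -0.5)         # symmetric normalisation
    return np.maximum(d_inv_sqrt @ a_hat @ d_inv_sqrt @ feats @ weight, 0.0)

def scaled_attention(query, keys, values):
    """Scaled dot-product attention of one query vector over a node set."""
    scores = keys @ query / np.sqrt(query.shape[0])
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ values, weights

rng = np.random.default_rng(0)
num_nodes, dim, num_classes = 4, 8, 5         # hypothetical sizes

# Hypothetical ontology graph: 4 concept nodes connected in a ring.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)
node_feats = rng.normal(size=(num_nodes, dim))
text_vec = rng.normal(size=dim)               # stand-in for a pooled RoBERTa embedding

# Ontology stream: propagate node features over the graph.
nodes = gcn_layer(adj, node_feats, rng.normal(size=(dim, dim)))

# Ontology-guided attention: the text vector queries the ontology nodes.
context, attn = scaled_attention(text_vec, nodes, nodes)

# Feature integration: concatenate both streams and project to class logits.
fused = np.concatenate([text_vec, context])
logits = fused @ rng.normal(size=(2 * dim, num_classes))
```

In a trained model the random matrices would be learned parameters and `text_vec` would come from the RoBERTa encoder; the sketch fixes them randomly only so that it runs on its own.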

Paper Structure (16 sections, 3 equations, 2 figures, 6 tables)

Introduction
Literature Review
Problem Definition and Research Questions
Dataset and Preprocessing
Linguistic Validation and Class Characterization
Preprocessing Pipeline
Methodology
Baseline Architecture: RoBERTa
Novel Architecture: RoBERTa-OTA
Text Processing Stream
Ontology Processing Stream
Feature Integration and Classification
Training Configuration and Optimization
Results and Discussion
Robustness Under Social Media Text Perturbations
...and 1 more section

Figures (2)

Figure 1: Pairwise Jensen-Shannon divergence between hate speech classes.
Figure 2: RoBERTa-OTA Architecture: Dual-Stream Processing Framework for Multiclass Hate-Speech Detection with Ontology-Guided Attention
