Target Span Detection for Implicit Harmful Content

Nazanin Jafari, James Allan, Sheikh Muhammad Sarwar

TL;DR

This work formalizes implicit Target Span Identification (iTSI), the task of detecting both explicit and implicit targets of hate speech, and introduces the Implicit-Target-Span (ITS) dataset, built from SBIC, DynaHate, and IHC via a novel pooling-based annotation scheme that combines human judgments with multiple LLMs. The authors define the task formally, present a pooling framework for creating high-quality target spans, and establish a baseline TargetDetect model that uses BIO tagging with transformer encoders, notably RoBERTa-Large. In cross-domain experiments, with the human-annotated PLEAD dataset as a point of comparison, ITS-trained models generalize competitively and transfer well across domains, illustrating ITS's value for robust, interpretable hate-speech systems. The dataset and baseline provide a foundation for advancing implicit target detection, and the error analysis highlights span-boundary and span-count errors as the main challenges for future work.
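To make the BIO tagging formulation concrete, the following is a minimal sketch of how target spans can be converted to per-token BIO labels and back. It is an illustration under stated assumptions, not the authors' TargetDetect code: the label names `B-TARGET` and `I-TARGET`, the whitespace tokenization, the half-open `(start, end)` token-index convention, and the example sentence are all hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token-index range [start, end); an assumed convention


def spans_to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Turn target spans into one BIO tag per token."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"span {(start, end)} is out of range")
        tags[start] = "B-TARGET"          # first token of the span
        for i in range(start + 1, end):
            tags[i] = "I-TARGET"          # remaining tokens of the span
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Recover spans from BIO tags (e.g. from a tagger's predictions)."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if start is not None:         # a new span begins right after another
                spans.append((start, i))
            start = i
        elif tag.startswith("I-"):
            if start is None:             # tolerate an I- tag with no preceding B-
                start = i
        else:
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:                 # span that runs to the end of the text
        spans.append((start, len(tags)))
    return spans


# Hypothetical example: the target is referred to without being named.
tokens = "those people are ruining this country".split()
tags = spans_to_bio(tokens, [(0, 2)])
```

A token classifier such as a transformer encoder would be trained to predict one such tag per token, and `bio_to_spans` would decode its output back into spans. Boundary and span-count errors of the kind mentioned above show up here as spans that are too long or too short, or as a span split into several.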

Abstract

Identifying the targets of hate speech is a crucial step in understanding the nature of such speech and, ultimately, in improving the detection of offensive posts on online forums. Much harmful content on online platforms uses implicit language, especially when targeting vulnerable and protected groups: for example, it refers to stereotypical characteristics instead of naming the target explicitly, which makes such content harder to detect and mitigate. In this study, we focus on identifying implied targets of hate speech, which is essential for recognizing subtler hate speech and enhancing the detection of harmful content on digital platforms. We define a new task aimed at identifying targets even when they are not explicitly stated. To address that task, we collect and annotate target spans in three prominent implicit hate speech datasets: SBIC, DynaHate, and IHC. We call the resulting merged collection Implicit-Target-Span. The collection is built using an innovative pooling method with matching scores based on human annotations and Large Language Models (LLMs). Our experiments indicate that Implicit-Target-Span provides a challenging test bed for target span detection methods.

Paper Structure

This paper contains 9 sections, 2 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Example of target span detection from a hate-speech dataset.
  • Figure 2: The framework for choosing the best annotation strategy among different LLMs. Each strategy's output is compared against the human annotator's spans using the $F1_{M}$ score, and the strategy with the highest score is chosen.
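
The selection step described in the Figure 2 caption can be sketched as follows. This is only an illustration: the definition of $F1_{M}$ is not given in this section, so the sketch assumes a token-overlap F1 between predicted and human spans, averaged over posts. The function names, the strategy names, and the toy data are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]  # half-open token-index range [start, end); an assumed convention


def span_tokens(spans: List[Span]) -> Set[int]:
    """Set of token indices covered by a list of spans."""
    return {i for start, end in spans for i in range(start, end)}


def token_f1(pred: List[Span], gold: List[Span]) -> float:
    """Token-overlap F1 for one post (an assumed stand-in for the matching score)."""
    p, g = span_tokens(pred), span_tokens(gold)
    if not p and not g:
        return 1.0                        # both agree that there is no target
    overlap = len(p & g)
    if overlap == 0:
        return 0.0
    precision = overlap / len(p)
    recall = overlap / len(g)
    return 2 * precision * recall / (precision + recall)


def pick_best_strategy(
    candidates: Dict[str, List[List[Span]]],
    human: List[List[Span]],
) -> Tuple[str, float]:
    """Score every strategy against the human spans and keep the highest."""
    scores: Dict[str, float] = {}
    for name, preds in candidates.items():
        if len(preds) != len(human):
            raise ValueError(f"strategy {name!r} has the wrong number of posts")
        per_post = [token_f1(p, g) for p, g in zip(preds, human)]
        scores[name] = sum(per_post) / len(per_post)
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data: two posts, two hypothetical annotation strategies.
human = [[(0, 2)], [(3, 5)]]
candidates = {
    "strategy_a": [[(0, 2)], [(3, 5)]],   # matches the human spans exactly
    "strategy_b": [[(0, 1)], []],         # partial match, then a miss
}
best_name, best_score = pick_best_strategy(candidates, human)
```

An exact-match or boundary-tolerant score could be swapped in for `token_f1` without changing the selection logic.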