Hate Speech According to the Law: An Analysis for Effective Detection

Katerina Korre, John Pavlopoulos, Paolo Gajo, Alberto Barrón-Cedeño

TL;DR

This study tackles prosecutable hate speech detection under heterogeneous national laws by creating an expert-annotated dataset covering Greece, Italy, and the United Kingdom and evaluating four pretrained language models (PLMs) alongside two large language models (LLMs). It investigates law-informed zero-shot and few-shot prompting, as well as leave-one-out cross-validation, and explores whether LLM-generated silver data can boost performance. The findings show that legal knowledge improves classification in some contexts, but LLM-generated data do not consistently enhance PLM performance, and cross-country differences in law influence both inter-annotator agreement and model outcomes. The work provides a cross-country dataset and insight into how legal frameworks interact with NLP models, offering guidance for developing more reliable prosecutable hate speech detectors across diverse legal regimes.

Abstract

The issue of hate speech extends beyond the confines of the online realm. It is a problem with real-life repercussions, prompting most nations to formulate legal frameworks that classify hate speech as a punishable offence. These legal frameworks differ from one country to another, contributing to the confusion that online platforms face when addressing reported instances of hate speech. Because existing definitions of hate speech fall short of providing a robust framework, we turn to hate speech laws. We consult legal experts on a hate speech dataset and experiment with various approaches, including models pretrained on hate speech data and on legal data, as well as two large language models (Qwen2-7B-Instruct and Meta-Llama-3-70B). Because acquiring data for prosecutable hate speech is time-consuming, we use pseudo-labeling to improve our pretrained models. This study highlights the importance of amplifying research on prosecutable hate speech and provides insights into effective strategies for combating hate speech within the parameters of legal frameworks. Our findings show that legal knowledge in the form of annotations can be useful when classifying prosecutable hate speech, yet more attention should be paid to the differences between the laws.

Paper Structure

This paper contains 36 sections, 1 equation, 25 figures, 14 tables.

Figures (25)

  • Figure 1: Distribution of classes as predicted on 1,000 instances from HateEval (Basile et al., 2019). The y-axis presents the absolute number of instances. The labels correspond to NP for Not Prosecutable, UP for Unlikely Prosecutable, LP for Likely Prosecutable, and P for Prosecutable. The bar plots illustrate the original binary labels of the instances, distinguishing between hate speech and non-hate speech.
  • Figure 2: HateBERT confusion matrix.
  • Figure 3: DehateBERT confusion matrix.
  • Figure 4: HateRoBERTa confusion matrix.
  • Figure 5: LegalBERT confusion matrix.
  • ...and 20 more figures