Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis

Nayeon Lee; Chani Jung; Junho Myung; Jiho Jin; Jose Camacho-Collados; Juho Kim; Alice Oh

TL;DR

CREHate is a cross-cultural English hate speech dataset covering Australia (AU), the United Kingdom (GB), Singapore (SG), the United States (US), and South Africa (ZA), built to quantify how cultural background influences hate speech annotation. Its two-step construction (cultural post collection followed by cross-cultural annotation) reveals substantial inter-country variation: only 56.2% of posts receive the same label from all five countries, and label agreement between countries decreases as cultural distance increases ($r=-0.658$, $p=0.039$). Large language models perform better on labels from Anglosphere countries and show limited culture-specific adaptation, while cross-cultural training approaches (multi-labeling, multi-task learning, culture tagging) improve cross-country accuracy. The work provides a foundation for culturally aware hate speech detection and highlights practical considerations for dataset design and moderation on global platforms.
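
As a rough consistency check on the reported correlation, the sketch below converts a Pearson $r$ into its t statistic. It assumes the correlation is a Pearson coefficient computed over the 10 country pairs that five countries form, which this summary does not state explicitly. The function name `pearson_t_statistic` is hypothetical.

```python
import math


def pearson_t_statistic(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation r from n paired observations."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Assumption: one (cultural distance, label agreement) data point per country pair.
N_COUNTRY_PAIRS = math.comb(5, 2)

t_value = pearson_t_statistic(-0.658, N_COUNTRY_PAIRS)
print(f"n = {N_COUNTRY_PAIRS}, t = {t_value:.2f}")
```

Under that assumption the test has 8 degrees of freedom, and a $|t|$ of about 2.47 exceeds the two-sided 5% critical value of 2.306, which is consistent with the reported $p < 0.05$.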

Abstract

Warning: this paper contains content that may be offensive or upsetting.

Most hate speech datasets neglect the cultural diversity within a single language, resulting in a critical shortcoming in hate speech detection. To address this, we introduce CREHate, a CRoss-cultural English Hate speech dataset. To construct CREHate, we follow a two-step procedure: 1) cultural post collection and 2) cross-cultural annotation. We sample posts from the SBIC dataset, which predominantly represents North America, and collect posts from four geographically diverse English-speaking countries (Australia, United Kingdom, Singapore, and South Africa) using culturally hateful keywords we retrieve from our survey. Annotations are collected from the four countries plus the United States to establish representative labels for each country. Our analysis highlights statistically significant disparities across countries in hate speech annotations. Only 56.2% of the posts in CREHate achieve consensus among all countries, with the highest pairwise label difference rate of 26%. Qualitative analysis shows that label disagreement occurs mostly due to different interpretations of sarcasm and the personal bias of annotators on divisive topics. Lastly, we evaluate large language models (LLMs) under a zero-shot setting and show that current LLMs tend to achieve higher accuracies on Anglosphere country labels in CREHate. Our dataset and code are available at: https://github.com/nlee0212/CREHate
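
The two agreement statistics quoted in the abstract (the share of posts with consensus among all countries, and the pairwise label difference rate) can be illustrated with a minimal sketch. The data layout, the function names, and the toy labels below are assumptions made for illustration; they are not taken from the CREHate release, and the toy values do not reproduce the paper's numbers.

```python
from itertools import combinations

COUNTRIES = ["AU", "GB", "SG", "US", "ZA"]


def consensus_rate(labels):
    """Fraction of posts on which every country assigns the same label."""
    unanimous = sum(
        1 for post in labels if len({post[c] for c in COUNTRIES}) == 1
    )
    return unanimous / len(labels)


def pairwise_difference_rates(labels):
    """Fraction of posts on which each pair of countries assigns different labels."""
    rates = {}
    for a, b in combinations(COUNTRIES, 2):
        differing = sum(1 for post in labels if post[a] != post[b])
        rates[(a, b)] = differing / len(labels)
    return rates


# Hypothetical per-country labels (1 = hate, 0 = non-hate), one dict per post.
toy_labels = [
    {"AU": 1, "GB": 1, "SG": 1, "US": 1, "ZA": 1},
    {"AU": 0, "GB": 0, "SG": 1, "US": 0, "ZA": 0},
    {"AU": 0, "GB": 0, "SG": 0, "US": 0, "ZA": 0},
    {"AU": 1, "GB": 1, "SG": 0, "US": 1, "ZA": 0},
]

consensus = consensus_rate(toy_labels)
rates = pairwise_difference_rates(toy_labels)
highest_rate = max(rates.values())

print(f"consensus rate: {consensus:.2f}")
print(f"highest pairwise difference rate: {highest_rate:.2f}")
```

Each per-country label here stands in for the representative label of that country; how those labels are aggregated from individual annotators is outside the scope of this sketch.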

Paper Structure (56 sections, 7 figures, 18 tables)


Introduction
Related Work
Dataset Construction
CREHate Post Collection
Sampling from SBIC
Collecting Cultural Samples
Cross-Cultural Annotation
Analysis on the Annotations
Significance of Cultural Backgrounds
Label Agreement among Countries
Annotators' Disagreement Analysis
Comparison between CC-SBIC and CP
Experiments
Zero-shot Predictions and Country Labels
Culture-Specific Hate Speech Classification
...and 41 more sections

Figures (7)

Figure 1: Illustration of the two-step procedure of CREHate construction: 1) cultural post collection and 2) cross-cultural annotation. The examples show how annotations on identical posts differ across countries.
Figure 2: (a) Pairwise label agreements across countries ordered by the average agreement with others. Labels from Singapore tend to be the most different. (b) Comparison of the label agreements among country pairs and random ones. The histogram and its density function show the distribution of pairwise label agreements among randomly selected annotator groups. The solid lines indicate country pairs with top-2 and bottom-2 label agreement scores, and the dashed line indicates the average of label agreements of all country pairs. Countries that are closely related exhibit high label agreements compared to the random annotator groups, whereas culturally distant countries show significantly low label agreements compared to label agreements from random annotator groups.
Figure 3: Ratio of disagreement reasons within posts. Differing interpretations of sarcasm and personal bias on divisive topics are the main factors of disagreement.
Figure 4: Disagreement reason count for CC-SBIC and CP posts.
Figure 5: Disclaimer and instruction shown to the annotators.
...and 2 more figures
