Is AI Ready for Multimodal Hate Speech Detection? A Comprehensive Dataset and Benchmark Evaluation

Rui Xing; Qi Chai; Jie Ma; Jing Tao; Pinghui Wang; Shuming Zhang; Xinping Wang; Hao Wang

Abstract

Hate speech online targets individuals or groups based on identity attributes and spreads rapidly, posing serious social risks. Memes, which combine images and text, have emerged as a nuanced vehicle for disseminating hate speech, often relying on cultural knowledge for interpretation. However, existing multimodal hate speech datasets suffer from coarse-grained labeling and a lack of integration with surrounding discourse, leading to imprecise and incomplete assessments. To bridge this gap, we propose an agentic annotation framework that coordinates seven specialized agents to generate hierarchical labels and rationales. Based on this framework, we construct M^3 (Multi-platform, Multi-lingual, and Multimodal Meme), a dataset of 2,455 memes collected from X, 4chan, and Weibo, featuring fine-grained hate labels and human-verified rationales. Benchmarking state-of-the-art Multimodal Large Language Models reveals that these models struggle to use surrounding post context effectively: adding it often fails to improve, or even degrades, detection performance. Our findings highlight the challenges these models face in reasoning over memes embedded in real-world discourse and underscore the need for a context-aware multimodal architecture. Our dataset and code are available at https://github.com/mira-ai-lab/M3.

Paper Structure (52 sections, 9 figures, 8 tables)

Introduction
Related Work
Unimodal Hate Speech Datasets
Multimodal Hate Speech Datasets
Agentic Annotation Framework
Data Acquisition
Collector.
Preprocessing
Extractor.
Cleaner.
Hierarchical Annotation
Annotator.
Arbiter.
Explicator.
Quality Assurance
...and 37 more sections

Figures (9)

Figure 1: Comparison between existing datasets and ours (M3). Existing datasets typically label the meme (right) as normal. However, our dataset labels it as hate, with a refined classification of healthy state, because of its accompanying post.
Figure 2: The agentic annotation framework for M3. ❶ Acquisition: Collector harvests multi-platform images and metadata. ❷ Preprocessing: Extractor and Cleaner perform OCR and metadata refinement. ❸ Annotation: Annotators, Arbiter, and Explicator collaborate on classification and rationale generation. ❹ Validation: Validator conducts quality assurance to finalize the M3 dataset (sample entry on the right).
Figure 3: Visualizing linguistic patterns in M3. The top panel displays the word cloud of posts in hate samples from X, while the bottom-left and bottom-right panels show the word clouds for Weibo and 4chan, respectively.
Figure 4: Hierarchical categories in M3. Hate samples are categorized into eight themes, with 1,018 single-labeled and 300 multi-labeled samples.
Figure 5: The distribution of single rationale and multiple rationales across X, Weibo, and 4chan.
...and 4 more figures
