Text or Image? What is More Important in Cross-Domain Generalization Capabilities of Hate Meme Detection Models?

Piush Aggarwal; Jawar Mehrabanian; Weigang Huang; Özge Alacam; Torsten Zesch

TL;DR

The paper investigates cross-domain generalization in multimodal hate meme detection and finds that the textual component largely drives generalization, while the image component is highly sensitive to the training data. Using Shapley-value analyses, the authors quantify modality contributions and observe an average text contribution of about 83%, which falls to 52% when image captions are included. They show that text-only classifiers can match or exceed multimodal models in zero-shot cross-domain settings and that captions can both aid and hinder performance depending on the dataset and setup. A confounder-dataset study further reveals that models are more susceptible to text-based confounders, underscoring the need to improve image modality integration and to design evaluation protocols that better separate modality effects.

Abstract

This paper delves into the formidable challenge of cross-domain generalization in multimodal hate meme detection, presenting compelling findings. We provide evidence supporting the hypothesis that only the textual component of hateful memes enables existing multimodal classifiers to generalize across different domains, while the image component proves highly sensitive to the specific training dataset. The evidence includes demonstrations showing that hate-text classifiers perform similarly to hate-meme classifiers in a zero-shot setting. At the same time, adding captions generated from the memes' images to the hate-meme classifier worsens performance by an average F1 of 0.02. Through black-box explanations, we identify a substantial contribution of the text modality (83% on average), which diminishes when the memes' image captions are introduced (52%). Additionally, our evaluation on a newly created confounder dataset reveals higher performance on text confounders than on image confounders, with an average $\Delta$F1 of 0.18.
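To make the idea of a per-modality contribution concrete, the following is a minimal sketch of exact Shapley values for a game with just two players, text and image. It is an illustration under stated assumptions, not the authors' implementation: the function names, the normalization into shares, and the toy scores are all hypothetical, and the paper's own procedure for masking inputs and aggregating contributions may differ.

```python
from typing import Callable, Dict, FrozenSet

Coalition = FrozenSet[str]


def two_player_shapley(value: Callable[[Coalition], float]) -> Dict[str, float]:
    """Exact Shapley values for the two players 'text' and 'image'.

    `value(S)` is assumed to return the model score when only the
    modalities in S are visible and the rest are masked.
    """
    empty: Coalition = frozenset()
    text: Coalition = frozenset({"text"})
    image: Coalition = frozenset({"image"})
    both: Coalition = frozenset({"text", "image"})

    # Average each player's marginal gain over both join orders.
    phi_text = 0.5 * ((value(text) - value(empty)) + (value(both) - value(image)))
    phi_image = 0.5 * ((value(image) - value(empty)) + (value(both) - value(text)))
    return {"text": phi_text, "image": phi_image}


def modality_shares(phi: Dict[str, float]) -> Dict[str, float]:
    """Normalize absolute Shapley values into shares that sum to 1."""
    total = sum(abs(v) for v in phi.values())
    if total == 0:
        return {name: 0.0 for name in phi}
    return {name: abs(v) / total for name, v in phi.items()}


# Hypothetical scores for one meme (made-up numbers, not from the paper).
toy_scores = {
    frozenset(): 0.10,
    frozenset({"text"}): 0.70,
    frozenset({"image"}): 0.20,
    frozenset({"text", "image"}): 0.90,
}

phi = two_player_shapley(toy_scores.__getitem__)
shares = modality_shares(phi)
print(phi, shares)
```

With only two players the Shapley value can be enumerated exactly. Attributing individual tokens and image patches, as black-box explainers typically do, requires sampling over many more coalitions.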

Paper Structure (27 sections, 4 figures, 7 tables)

Introduction
Related Work
Modality Contribution with Shapley Values
Datasets
Hateful Meme Datasets
Kiela et al.
Pramanick et al.
Fersini et al.
Confounder Dataset
Annotation process
Experimental Models
Unimodal Hate Recognition
Multimodal Hate Recognition
Image Caption Generation
Experiments & Results
...and 12 more sections

Figures (4)

Figure 1: Illustration of our experimental arrangement for assessing the hate meme model's performance compared to unimodal text-based hate classifiers. The evaluation involves a test meme from a domain not included in the model's training data.
Figure 2: A schematic showing the data collection process of our proposed dataset.
Figure 3: Example of the modality contribution of the Rob+Resnet-based hate meme detection model when trained on different hateful meme datasets. The notation (+C) indicates that additional captions were used in model training. Red and green indicate low and high contribution, respectively.
Figure 4: PDF preview of the instruction manual.
