Unpacking Hateful Memes: Presupposed Context and False Claims
Weibin Cai, Jiayu Li, Reza Zafarani
TL;DR
This work identifies presupposed context and false claims as the core expressive mechanisms of hateful memes and builds SHIELD, a framework that combines a Presupposed Context Module (PCM) with a False Claims Module (FACT). PCM encodes intra-modal context and fuses cross-modal cues to produce a context embedding. FACT has two parts: a Social Perception Module (SPM), which draws on external knowledge through a fine-tuned LLM, and a Cross-modal Reference Module (CRM), which constructs a cross-modal reference graph and processes it with a GNN to yield a reference embedding. The classifier concatenates the PCM, SPM, and CRM representations to detect hate, and a theoretical analysis examines the reference graph’s discriminative properties. Empirically, SHIELD outperforms strong baselines on three hateful meme datasets and also transfers to fake-news classification, which suggests that it generalizes beyond hateful meme detection. The work offers a theory-grounded approach to hate detection that integrates philosophical and psychological insights with multimodal learning to address the societal harms of hateful memes.
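The data flow described above (a graph-derived reference embedding, concatenated with the context and social-perception embeddings, then classified) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy dimensions, the use of mean neighbor aggregation as a stand-in for the GNN, and the logistic classifier head are hypothetical and are not taken from the SHIELD implementation.

```python
import math


def mean_aggregate(node_feats, edges):
    """One round of mean neighbor aggregation with self-loops.

    A simple stand-in for a GNN layer (assumption: the summary does not
    say which GNN architecture is used).
    """
    neighbors = {i: [i] for i in range(len(node_feats))}
    for u, v in edges:  # treat reference edges as undirected
        neighbors[u].append(v)
        neighbors[v].append(u)
    dim = len(node_feats[0])
    return [
        [sum(node_feats[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbors.values()
    ]


def mean_pool(node_feats):
    """Average node features into a single graph-level embedding."""
    dim = len(node_feats[0])
    return [sum(f[d] for f in node_feats) / len(node_feats) for d in range(dim)]


def hate_probability(pcm_emb, spm_emb, crm_emb, weights, bias):
    """Concatenate the three embeddings and apply a logistic head (hypothetical)."""
    fused = pcm_emb + spm_emb + crm_emb  # list concatenation
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused embedding length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy cross-modal reference graph: nodes 0 and 1 could be text tokens,
# node 2 an image region that both tokens refer to (made-up values).
node_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 2), (1, 2)]
crm_emb = mean_pool(mean_aggregate(node_feats, edges))

pcm_emb = [0.2, -0.1]  # placeholder context embedding
spm_emb = [0.5, 0.3]   # placeholder social-perception embedding
weights = [0.0] * 6    # untrained classifier weights
prob = hate_probability(pcm_emb, spm_emb, crm_emb, weights, bias=0.0)
```

In a real system the three embeddings would come from learned encoders and the classifier weights would be trained; the sketch only shows how the pieces fit together.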
Abstract
While memes are often humorous, they are frequently used to disseminate hate, causing serious harm to individuals and society. Current approaches to hateful meme detection mainly rely on pre-trained language models. However, less focus has been dedicated to \textit{what makes a meme hateful}. Drawing on insights from philosophy and psychology, we argue that hateful memes are characterized by two essential features: a \textbf{presupposed context} and the expression of \textbf{false claims}. To capture presupposed context, we develop \textbf{PCM} for modeling contextual information across modalities. To detect false claims, we introduce the \textbf{FACT} module, which integrates external knowledge and harnesses cross-modal reference graphs. By combining PCM and FACT, we introduce \textbf{\textsf{SHIELD}}, a hateful meme detection framework designed to capture the fundamental nature of hate. Extensive experiments show that SHIELD outperforms state-of-the-art methods across datasets and metrics, while demonstrating versatility on other tasks, such as fake news detection.
