Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection

Jinfa Huang, Jinsheng Pan, Zhongwei Wan, Hanjia Lyu, Jiebo Luo

TL;DR

Hateful memes evolve rapidly, challenging static detectors. Evolver introduces Chain-of-Evolution prompting that augments Large Multimodal Models with Evolutionary Pair Mining, Evolution Information Extractor, and Contextual Relevance Amplifier to reason about meme evolution. The work provides a zero-shot hateful meme detection benchmark and shows consistent performance gains across FHM, MAMI, and HarM datasets with MMICL and LLaVA-1.5 backbones, while offering interpretable cues about evolution-driven reasoning. Key findings include substantial ACC/AUC improvements and clear ablations validating each component, with practical impact on adaptable, explainable moderation of evolving online content.

Abstract

Recent advances show that two-stream approaches have achieved outstanding performance in hateful meme detection. However, hateful memes constantly evolve as new memes emerge by fusing progressive cultural ideas, making existing methods obsolete or ineffective. In this work, we explore the potential of Large Multimodal Models (LMMs) for hateful meme detection. To this end, we propose Evolver, which incorporates LMMs via Chain-of-Evolution (CoE) Prompting by integrating the evolution attribute and in-context information of memes. Specifically, Evolver simulates the evolving and expressing process of memes and reasons through LMMs in a step-by-step manner. First, an evolutionary pair mining module retrieves, from an external curated meme set, the top-k memes most similar to the input meme. Second, an evolutionary information extractor summarizes the semantic regularities between the paired memes for prompting. Finally, a contextual relevance amplifier enhances the in-context hatefulness information to boost the search for evolutionary processes. Extensive experiments on the public FHM, MAMI, and HarM datasets show that CoE prompting can be incorporated into existing LMMs to improve their performance. More encouragingly, it can serve as an interpretive tool to promote the understanding of the evolution of social memes. Homepage: https://github.com/inFaaa/Evolver
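
The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each meme is already represented by an embedding vector and that similarity is measured by cosine similarity, neither of which the abstract specifies. The function names `mine_pairs` and `build_final_prompt`, and the prompt wording, are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mine_pairs(
    query: Sequence[float],
    pool: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[str]:
    """Step 1 (pair mining, assumed form): ids of the k pool memes closest to the query."""
    ranked = sorted(pool, key=lambda item: cosine(query, item[1]), reverse=True)
    return [meme_id for meme_id, _ in ranked[:k]]


def build_final_prompt(evolution_info: str) -> str:
    """Steps 2-3 (assumed form): combine the summary of the retrieved pairs,
    which an LMM would produce, with an instruction that stresses hateful context."""
    amplifier = "Pay attention to any hateful context in the image and text."  # hypothetical wording
    return (
        f"Evolution information: {evolution_info}\n"
        f"{amplifier}\n"
        "Is this meme hateful? Answer yes or no."
    )
```

In this sketch the value of `k` plays the role of the number of evolutionary memes retrieved per input, and the string passed to `build_final_prompt` stands in for the extractor's output.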

Paper Structure (17 sections, 6 equations, 5 figures, 10 tables)

Figures (5)

  • Figure 1: Illustration of (a) the evolution of memes and a comparison between (b) conventional two-stream methods and (c) our Evolver method. Memes evolve by fusing new cultural concepts. Here, the Trump meme borrows image and text symbols from the sad frog meme, producing a new hateful meme. Conventional hateful meme detection methods use trainable two-stream encoders and fusion for meme classification, with poor interpretability. In contrast, our Evolver captures the evolution and context of memes and uses them as prompts for large multimodal models to obtain a comprehensive understanding of memes.
  • Figure 2: Ablation study of the three components of our method on FHM dataset.
  • Figure 3: Example results of the Evolver (Ours) and the baseline model (MMICL). For more examples refer to the Appendix.
  • Figure 4: Effect of the number of evolutionary memes.
  • Figure 5: Example results of the Evolver (Ours) and the baseline model (MMICL).