Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection
Jinfa Huang, Jinsheng Pan, Zhongwei Wan, Hanjia Lyu, Jiebo Luo
TL;DR
Hateful memes evolve rapidly, which challenges static detectors. Evolver introduces Chain-of-Evolution (CoE) prompting, which augments Large Multimodal Models (LMMs) with three components for reasoning about meme evolution: Evolutionary Pair Mining, an Evolution Information Extractor, and a Contextual Relevance Amplifier. The work provides a zero-shot hateful meme detection benchmark and reports consistent gains in accuracy (ACC) and AUC on the FHM, MAMI, and HarM datasets with MMICL and LLaVA-1.5 backbones. Ablations support the contribution of each component, and the evolution-driven reasoning yields interpretable cues, which makes the approach relevant to adaptable, explainable moderation of evolving online content.
Abstract
Recent advances show that two-stream approaches have achieved outstanding performance in hateful meme detection. However, hateful memes constantly evolve as new memes emerge by fusing progressive cultural ideas, making existing methods obsolete or ineffective. In this work, we explore the potential of Large Multimodal Models (LMMs) for hateful meme detection. To this end, we propose Evolver, which incorporates LMMs via Chain-of-Evolution (CoE) Prompting by integrating the evolution attribute and in-context information of memes. Specifically, Evolver simulates how memes evolve and are expressed, and reasons through LMMs in a step-by-step manner. First, an evolutionary pair mining module retrieves the top-k memes most similar to the input meme from an external curated meme set. Second, an evolutionary information extractor summarizes the semantic regularities between the paired memes for prompting. Finally, a contextual relevance amplifier enhances the in-context hatefulness information to boost the search for evolutionary processes. Extensive experiments on the public FHM, MAMI, and HarM datasets show that CoE prompting can be incorporated into existing LMMs to improve their performance. More encouragingly, it can serve as an interpretive tool to promote the understanding of the evolution of social memes. Homepage: https://github.com/inFaaa/Evolver
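The three steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes memes are already represented as embedding vectors and short text descriptions, it uses cosine similarity as the retrieval metric, and all function names and prompt wordings (`mine_evolutionary_pairs`, `build_extraction_prompt`, `build_detection_prompt`) are hypothetical. The calls to the LMM itself are left out.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mine_evolutionary_pairs(
    query: Sequence[float], pool: Sequence[Sequence[float]], k: int
) -> List[int]:
    """Step 1 (assumed form): indices of the k pool memes most similar to the query."""
    # Sort by descending similarity; ties are broken by the lower index.
    ranked = sorted(range(len(pool)), key=lambda i: (-cosine(query, pool[i]), i))
    return ranked[:k]


def build_extraction_prompt(query_desc: str, retrieved_descs: List[str]) -> str:
    """Step 2 (hypothetical wording): ask the LMM what the paired memes share."""
    listed = "\n".join(f"- {d}" for d in retrieved_descs)
    return (
        "Here is a meme and several similar earlier memes.\n"
        f"Input meme: {query_desc}\n"
        f"Similar memes:\n{listed}\n"
        "Summarize the patterns these memes have in common."
    )


def build_detection_prompt(query_desc: str, evolution_summary: str) -> str:
    """Step 3 (hypothetical wording): final prompt stressing hatefulness context."""
    return (
        f"Evolution summary: {evolution_summary}\n"
        "Pay particular attention to hateful cues in the context above.\n"
        f"Input meme: {query_desc}\n"
        "Is this meme hateful? Answer yes or no."
    )


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for real image-text features.
    pool = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
    top = mine_evolutionary_pairs([1.0, 0.0], pool, k=2)
    print(build_extraction_prompt("query meme", [f"pool meme {i}" for i in top]))
```

In a full pipeline, the LMM's answer to the extraction prompt would be passed as `evolution_summary` to the detection prompt, so the final yes/no decision is conditioned on the summarized evolution information.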
