Read as You See: Guiding Unimodal LLMs for Low-Resource Explainable Harmful Meme Detection
Fengjun Pan, Xiaobao Wu, Tho Quan, Anh Tuan Luu
TL;DR
This work introduces U-CoT+, a framework for harmful meme detection under low-resource constraints. Its High-Fidelity Meme2Text pipeline uses lightweight LMMs to convert multimodal memes into high-fidelity textual descriptions $D_h$, so that unimodal LLMs can reason over text alone. Unimodal Guided CoT Prompting then applies human-crafted guidelines to produce transparent classifications and rationales, enabling adaptable, context-sensitive moderation. Across seven benchmark datasets, U-CoT+ achieves zero-shot performance competitive with resource-intensive baselines, often matching or surpassing GPT-4o-mini, while offering improved explainability and efficiency. The framework thus provides a scalable, adaptable approach to explainable harmful meme detection suited to low-resource deployment and real-world moderation.
Abstract
Detecting harmful memes is crucial for safeguarding the integrity and harmony of online environments, yet existing detection methods are often resource-intensive, inflexible, and lacking in explainability, which limits their usefulness for real-world web content moderation. We propose U-CoT+, a resource-efficient framework that prioritizes accessibility, flexibility, and transparency in harmful meme detection by fully harnessing the capabilities of lightweight unimodal large language models (LLMs). Instead of directly prompting or fine-tuning large multimodal models (LMMs) as black-box classifiers, we avoid reasoning directly over complex visual inputs and instead decouple meme content recognition from meme harmfulness analysis through a high-fidelity meme-to-text pipeline, in which lightweight LMMs and LLMs collaborate to convert multimodal memes into natural language descriptions that preserve critical visual information, thus enabling text-only LLMs to "see" memes by "reading". Grounded in these textual inputs, we further guide the reasoning of unimodal LLMs under zero-shot Chain-of-Thought (CoT) prompting with targeted, interpretable, context-aware, and easily obtained human-crafted guidelines, thus providing accountable step-by-step rationales while enabling flexible and efficient adaptation to diverse sociocultural criteria of harmfulness. Extensive experiments on seven benchmark datasets show that U-CoT+ achieves performance comparable to resource-intensive baselines, highlighting its effectiveness and potential as a scalable, explainable, and low-resource solution to support harmful meme detection.
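The second stage described above, guiding a text-only LLM with human-crafted guidelines under zero-shot CoT prompting, can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the "Answer:" output convention, and the function names (`build_guided_cot_prompt`, `parse_label`, `classify_meme`) are all hypothetical, and the LLM is represented by any callable that maps a prompt string to a response string. The meme description is assumed to have already been produced by a meme-to-text step.

```python
from typing import Callable, List, Tuple


def build_guided_cot_prompt(description: str, guidelines: List[str]) -> str:
    """Assemble a zero-shot CoT prompt from a meme description and guidelines.

    The wording here is illustrative only, not the prompt used in the paper.
    """
    numbered = "\n".join(f"{i}. {g}" for i, g in enumerate(guidelines, start=1))
    return (
        "Meme description:\n"
        f"{description}\n\n"
        "Guidelines:\n"
        f"{numbered}\n\n"
        "Let's think step by step, then finish with a line of the form "
        "'Answer: harmful' or 'Answer: harmless'."
    )


def parse_label(response: str) -> str:
    """Return the label on the last 'Answer:' line, or 'unknown' if absent."""
    for line in reversed(response.strip().splitlines()):
        stripped = line.strip()
        if stripped.lower().startswith("answer:"):
            label = stripped.split(":", 1)[1].strip().lower()
            if label in {"harmful", "harmless"}:
                return label
    return "unknown"


def classify_meme(
    description: str,
    guidelines: List[str],
    llm: Callable[[str], str],
) -> Tuple[str, str]:
    """Run the (hypothetical) guided CoT step and return (label, rationale).

    `llm` is any text-in, text-out callable; the full response is kept as the
    step-by-step rationale so the decision stays inspectable.
    """
    prompt = build_guided_cot_prompt(description, guidelines)
    rationale = llm(prompt)
    return parse_label(rationale), rationale
```

Because the guidelines are plain strings passed in at call time, adapting the classifier to a different harmfulness criterion in this sketch means editing a list rather than retraining a model, which mirrors the flexibility the abstract emphasizes.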
