Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities

Delfina Sol Martinez Pandiani; Erik Tjong Kim Sang; Davide Ceolin

TL;DR

This work addresses the rapid growth of computational analyses of toxic memes with an up-to-date, PRISMA-guided survey covering work through early 2024. It synthesizes 158 content-based studies, catalogs more than 30 datasets, and proposes a harmonized taxonomy of meme toxicities alongside a three-dimensional framework (target, intent, tactic) to organize toxicity types. The paper highlights key trends shaping detection and interpretation: enhanced cross-modal reasoning, background knowledge integration, explainability, multilingual challenges, and the rising role of LLMs. It also outlines practical directions for advancing automatic detection, explanation, and responsible moderation, emphasizing inclusivity across languages and cultures and the ethical implications of automated interventions.

Abstract

Internet memes, channels for humor, social commentary, and cultural expression, are increasingly used to spread toxic messages. Studies on the computational analyses of toxic memes have significantly grown over the past five years, and the only three surveys on computational toxic meme analysis cover only work published until 2022, leading to inconsistent terminology and unexplored trends. Our work fills this gap by surveying content-based computational perspectives on toxic memes, and reviewing key developments until early 2024. Employing the PRISMA methodology, we systematically extend the previously considered papers, achieving a threefold result. First, we survey 119 new papers, analyzing 158 computational works focused on content-based toxic meme analysis. We identify over 30 datasets used in toxic meme analysis and examine their labeling systems. Second, after observing the existence of unclear definitions of meme toxicity in computational works, we introduce a new taxonomy for categorizing meme toxicity types. We also note an expansion in computational tasks beyond the simple binary classification of memes as toxic or non-toxic, indicating a shift towards achieving a nuanced comprehension of toxicity. Third, we identify three content-based dimensions of meme toxicity under automatic study: target, intent, and conveyance tactics. We develop a framework illustrating the relationships between these dimensions and meme toxicities. The survey analyzes key challenges and recent trends, such as enhanced cross-modal reasoning, integrating expert and cultural knowledge, the demand for automatic toxicity explanations, and handling meme toxicity in low-resource languages. Also, it notes the rising use of Large Language Models (LLMs) and generative AI for detecting and generating toxic memes. Finally, it proposes pathways for advancing toxic meme detection and interpretation.

Paper Structure (61 sections, 12 figures, 8 tables)

Introduction
Background
Defining (Internet) Memes
Defining Online Toxicity
Textual Toxicity
Multimodal Toxicity
Related Work
Surveys on Toxic Memes from Computational Perspective
Other Relevant Surveys On Hate Speech and Disinformation
Need for this Survey
Methodology
Selection of Database
Inclusion of Preprints
Identification of Studies to Review
Database Query
...and 46 more sections

Figures (12)

Figure 1: Graph depicting the exponential increase in publications within the field of computer science, as indexed by SCOPUS, focusing on research related to toxic memes. The data was gathered using a query targeting specific keywords associated with meme toxicities (see the Methodology section).
Figure 2: PRISMA 2020 flow diagram for systematic reviews on *SCOPUS and Web of Science (WOS) databases. Registers refers to SCOPUS preprints. Records excluded due to: ** Topic Non-Relevance. *** Computational Non-Relevance.
Figure 3: Left: Distribution of surveyed papers by publication year. The figure illustrates a steady increase in the number of publications from year to year. Right: Comparison of coverage across previous surveys based on the papers surveyed here.
Figure 4: Top: Pie chart showing the distribution of datasets based on the number of memes they contain. Approximately 40% of datasets contain between 1,000 and 5,000 memes, followed by nearly 30% with 5,000 to 10,000 memes. Fewer than 3% of the datasets have over 15,000 memes. Middle: Pie chart illustrating the distribution of dataset languages. English dominates, with nearly 75% of the datasets exclusively in English, while the remaining datasets include Hindi, Bengali, Tamil, and code-mixed memes. Bottom: Pie chart displaying the sources of memes in the datasets. Social media platforms contribute the most (blue), followed by meme-specific sources (green), search engines (yellow), and image-hosting platforms (red). Over half of the datasets include memes from social media.
Figure 5: Proportion of papers focused on different meme toxicity categories, based on authors' explicit labels.
...and 7 more figures
