Contextualizing Internet Memes Across Social Media Platforms
Saurav Joshi, Filip Ilievski, Luca Luceri
TL;DR
This work proposes a knowledge-grounded framework for contextualizing internet memes across platforms. It grounds image memes from Reddit and Discord to the Internet Meme Knowledge Graph (IMKG) using an ETL-driven data lake, Vision Transformer embeddings, and FAISS-based similarity search. The work addresses four research questions: how memes can be mapped to IMKG, how prevalent and how popular IMKG memes are on the two platforms, and how much IMKG adds to meme understanding. The grounding achieves high precision (approximately $0.93$–$0.95$ at a similarity threshold of $t=0.60$) and identifies thousands of IMKG-aligned memes. The methodology yields cross-platform insights such as shared top memes (e.g., Drake-Hotline-Bling) and shows how grounding links memes to templates, media frames, and encyclopedic knowledge from KnowYourMeme and Wikidata, which enriches interpretation and enables contextual analysis. Limitations include dependence on IMKG coverage, which may lag behind newly emerged memes, potential biases in meme selection, and the absence of textual context in the image-only grounding; these point to future work on expanding coverage and integrating additional modalities. Overall, the approach offers a scalable way to study meme ecosystems systematically and supports downstream applications in moderation, cultural analytics, and platform understanding.
Abstract
Internet memes have emerged as a novel format for communicating and expressing ideas on the web. Their fluidity and creative nature are reflected in their widespread use, often across platforms and occasionally for unethical or harmful purposes. While computational work has already analyzed their high-level virality over time and developed specialized classifiers for hate speech detection, there have been no efforts to date to holistically track, identify, and map internet memes posted on social media. To bridge this gap, we investigate whether internet memes across social media platforms can be contextualized using a semantic repository of knowledge, namely a knowledge graph. We collect thousands of potential internet meme posts from two social media platforms, Reddit and Discord, and develop an extract-transform-load procedure to create a data lake of candidate meme posts. Using vision transformer-based similarity, we match these candidates against the memes cataloged in IMKG, a recently released knowledge graph of internet memes. We leverage this grounding to highlight the potential of our proposed framework to study the prevalence of memes on different platforms, map them to IMKG, and provide context about memes on social media.
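The matching step can be illustrated with a minimal sketch. It is not the authors' implementation: the toy three-dimensional vectors stand in for Vision Transformer embeddings, a brute-force loop stands in for a FAISS index, cosine similarity is an assumed metric, and the function and variable names (`ground_candidates`, `templates`, `candidates`) are hypothetical. Only the threshold of 0.60 is taken from the summary above.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def ground_candidates(candidates, templates, threshold=0.60):
    """Map each candidate id to (template id, similarity), or None if no
    template reaches the threshold.

    Assumption: cosine similarity over image embeddings. A real pipeline
    would obtain the vectors from a vision transformer and search them
    with an approximate or exact nearest-neighbor index.
    """
    normed_templates = {tid: l2_normalize(v) for tid, v in templates.items()}
    results = {}
    for cid, vec in candidates.items():
        query = l2_normalize(vec)
        best_id, best_sim = None, -1.0
        for tid, template in normed_templates.items():
            sim = sum(a * b for a, b in zip(query, template))
            if sim > best_sim:
                best_id, best_sim = tid, sim
        # Keep the nearest template only when it is similar enough.
        results[cid] = (best_id, best_sim) if best_sim >= threshold else None
    return results


# Toy embeddings (hypothetical values, not real model output).
templates = {
    "drake-hotline-bling": [0.9, 0.1, 0.0],
    "distracted-boyfriend": [0.0, 0.2, 0.9],
}
candidates = {
    "post_a": [0.8, 0.2, 0.1],   # close to the first template
    "post_b": [0.5, -0.7, 0.5],  # not close to either template
}
matches = ground_candidates(candidates, templates)
```

In this sketch a candidate that falls below the threshold is left ungrounded rather than forced onto its nearest template, which reflects the precision-oriented use of a similarity cutoff described above.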
