
IITK at SemEval-2024 Task 4: Hierarchical Embeddings for Detection of Persuasion Techniques in Memes

Shreenaga Chikoti, Shrey Mehta, Ashutosh Modi

TL;DR

The paper tackles multilingual detection of persuasion techniques in memes (SemEval-2024 Task 4) by coupling HypEmo's hierarchical hyperbolic label embeddings with a class-definition prediction framework and extending the approach to multimodal inputs via CLIP. It deploys task-specific ensembles across three subtasks: text-only hierarchical multi-label classification, text+image hierarchical multi-label classification, and binary text+image detection, reporting F1 scores of 0.60, 0.67, and 0.48, respectively. The findings indicate that textual cues drive most of the performance, that CLIP-derived visual features offer limited gains except for certain techniques, and that a union ensemble of models yields robust results. The work contributes a multimodal approach to persuasion detection in memes and provides code for reproducibility.
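The union ensemble mentioned above can be illustrated with a minimal sketch. It assumes each model emits one set of technique labels per meme and that the ensemble keeps every label predicted by at least one model; the function name `union_ensemble` and the example predictions are hypothetical and are not taken from the paper's code.

```python
from typing import List, Sequence, Set


def union_ensemble(per_model: Sequence[List[Set[str]]]) -> List[Set[str]]:
    """Merge multi-label predictions from several models.

    per_model[m][i] is the label set that model m predicts for example i.
    The result keeps, for each example, every label predicted by any model.
    """
    if not per_model:
        return []
    n_examples = len(per_model[0])
    if any(len(preds) != n_examples for preds in per_model):
        raise ValueError("all models must predict for the same examples")
    return [
        set().union(*(preds[i] for preds in per_model))
        for i in range(n_examples)
    ]


# Hypothetical predictions from two models on two memes.
model_a = [{"Smears"}, set()]
model_b = [{"Loaded Language"}, {"Name calling"}]
combined = union_ensemble([model_a, model_b])
```

A union favors recall over precision: a label survives if any single model proposes it, which is one plausible reason such an ensemble helps when individual models miss different techniques.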

Abstract

Memes are one of the most popular types of content used in online disinformation campaigns. They are particularly effective on social media platforms since they can easily reach many users. Memes in a disinformation campaign achieve their goal of influencing users through several rhetorical and psychological techniques, such as causal oversimplification, name-calling, and smear. The SemEval 2024 Task 4, "Multilingual Detection of Persuasion Techniques in Memes", on identifying such techniques in memes is divided into three sub-tasks: (1) hierarchical multi-label classification using only the textual content of the meme, (2) hierarchical multi-label classification using both the textual and visual content of the meme, and (3) binary classification of whether the meme contains a persuasion technique or not using its textual and visual content. This paper proposes an ensemble of Class Definition Prediction (CDP) and hyperbolic embeddings-based approaches for this task. We enhance meme classification accuracy and comprehensiveness by integrating HypEmo's hierarchical label embeddings (Chen et al., 2023) and a multi-task learning framework for emotion prediction. We achieve a hierarchical F1-score of 0.60, 0.67, and 0.48 on the respective sub-tasks.
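To make the reported metric concrete, the sketch below computes a hierarchical F1 under one common definition: gold and predicted label sets are each expanded with all of their ancestors in the taxonomy, and precision and recall are computed on the expanded sets. This is an assumption about the metric's general shape, not the task's official scorer. The `PARENT` map is a hypothetical toy tree rather than the paper's taxonomy, and `with_ancestors` and `hierarchical_f1` are illustrative names.

```python
from typing import Dict, List, Set

# Hypothetical toy taxonomy (child -> parent); roots have no entry.
PARENT: Dict[str, str] = {
    "Ad Hominem": "Ethos",
    "Name calling": "Ad Hominem",
    "Smears": "Ad Hominem",
}


def with_ancestors(labels: Set[str], parent: Dict[str, str]) -> Set[str]:
    """Return the labels together with every ancestor of each label."""
    expanded: Set[str] = set()
    for label in labels:
        node = label
        while node is not None:
            expanded.add(node)
            node = parent.get(node)  # None once a root is reached
    return expanded


def hierarchical_f1(
    gold: List[Set[str]], pred: List[Set[str]], parent: Dict[str, str]
) -> float:
    """Micro-averaged F1 over ancestor-expanded label sets."""
    overlap = pred_total = gold_total = 0
    for g, p in zip(gold, pred):
        g_ext = with_ancestors(g, parent)
        p_ext = with_ancestors(p, parent)
        overlap += len(g_ext & p_ext)
        pred_total += len(p_ext)
        gold_total += len(g_ext)
    if overlap == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / gold_total
    return 2 * precision * recall / (precision + recall)


# Predicting a sibling of the gold label earns partial credit
# through the shared ancestors "Ad Hominem" and "Ethos".
partial = hierarchical_f1([{"Name calling"}], [{"Smears"}], PARENT)
exact = hierarchical_f1([{"Name calling"}], [{"Name calling"}], PARENT)
```

Under this definition a wrong leaf that shares ancestors with the gold label is penalized less than a prediction from an unrelated branch, which is the main difference from flat multi-label F1.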

Paper Structure

This paper contains 13 sections, 2 equations, 7 figures, 10 tables.

Figures (7)

  • Figure 1: Sample set of memes showing the multi-modal setting
  • Figure 2: Taxonomy of persuasion techniques for sub-task 2
  • Figure 3: Frequency distribution of labels in the training dataset
  • Figure 4: Frequency distribution of labels in the validation dataset
  • Figure 5: The meme sarcastically suggests that individuals who oppose Trump are being unfairly equated with terrorists, highlighting the absurdity of such comparisons. Two persuasion techniques are used, (i) Loaded Language and (ii) Name calling, both of which can be inferred from the text and the visual content.
  • ...and 2 more figures