Irony in Emojis: A Comparative Study of Human and LLM Interpretation

Yawen Zheng; Hanjia Lyu; Jiebo Luo

TL;DR

This paper compares GPT-4o's interpretation of irony in emojis to human perception using the Ciron Weibo dataset, examining how age and gender prompts affect irony judgments. Humans rate irony on posts containing emojis, while GPT-4o rates the likelihood that an emoji expresses irony, enabling a direct comparison via statistical tests. The results show GPT-4o generally assigns higher irony potential than humans, with only modest alignment and clear age effects, highlighting model biases and contextual dependencies. The work underscores the need for cross-cultural and demographic-aware evaluation of emoji interpretation in AI systems for sentiment analysis and conversational agents.

Abstract

Emojis have become a universal language in online communication, often carrying nuanced and context-dependent meanings. Among these, irony poses a significant challenge for Large Language Models (LLMs) due to its inherent incongruity between appearance and intent. This study examines the ability of GPT-4o to interpret irony in emojis. By prompting GPT-4o to evaluate the likelihood of specific emojis being used to express irony on social media and comparing its interpretations with human perceptions, we aim to bridge the gap between machine and human understanding. Our findings reveal nuanced insights into GPT-4o's interpretive capabilities, highlighting areas of alignment with and divergence from human behavior. Additionally, this research underscores the importance of demographic factors, such as age and gender, in shaping emoji interpretation and evaluates how these factors influence GPT-4o's performance.

Paper Structure

This paper contains 12 sections, 2 equations, 2 figures, and 2 tables.

Introduction
Related Work
Method
Human Perception of Emoji Irony
GPT-4o's Classification of Emoji Irony
Prompt Design
Experiment Setting
Results
Prompts with Demographic Information
Discussions and Conclusions
Appendix
Further Discussion on Potential Broader Impact and Ethical Considerations

Figures (2)

Figure 1: While GPT-4o generally rates the same emoji as more likely to be used for expressing irony compared to human perception, the irony scores assigned by GPT-4o and those perceived by humans show a significant correlation. Emojis positioned closer to the dashed line indicate greater alignment between GPT-4o's classification of their use for expressing irony and human perception.
Figure 2: When the prompt includes demographic information, no significant differences in irony scores are observed between prompts specifying female or male gender. However, the irony scores tend to decrease on average as the specified age in the prompt increases.
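
The comparison summarized in Figure 1 (GPT-4o scores sitting above human scores on average while still correlating with them) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the paper's exact statistical tests are not given here, so Pearson correlation and a mean signed gap are assumed stand-ins, and the emoji names and score values are hypothetical placeholders rather than results from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def mean_gap(human, model):
    """Average of (model score - human score); positive means the model rates higher."""
    if len(human) != len(model) or not human:
        raise ValueError("need two non-empty lists of equal length")
    return sum(m - h for h, m in zip(human, model)) / len(human)


# Hypothetical placeholder scores per emoji (NOT values from the paper).
human_scores = {"emoji_a": 0.10, "emoji_b": 0.30, "emoji_c": 0.50, "emoji_d": 0.70}
model_scores = {"emoji_a": 0.30, "emoji_b": 0.45, "emoji_c": 0.80, "emoji_d": 0.85}

# Align the two score sets on the emojis they share, in a fixed order.
shared = sorted(human_scores.keys() & model_scores.keys())
human = [human_scores[e] for e in shared]
model = [model_scores[e] for e in shared]

print("correlation:", round(pearson(human, model), 3))
print("mean gap (model - human):", round(mean_gap(human, model), 3))
```

Points on the dashed line in Figure 1 correspond to a per-emoji gap of zero; a positive mean gap together with a high correlation would match the pattern the caption describes.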
