Context-Aware Pragmatic Metacognitive Prompting for Sarcasm Detection
Michael Iskandardinata, William Christian, Derwin Suhartono
TL;DR
This work tackles sarcasm detection in NLP by extending Pragmatic Metacognitive Prompting (PMP) with a retrieval-aware context strategy. It combines non-parametric web retrieval with self-knowledge elicitation to give the LLM targeted background for its reasoning, improving performance across culturally varied data. The approach is evaluated on Indonesian Twitter, SemEval-2018 Task 3, and MUStARD: retrieval grounding yields a macro-F1 gain of up to 9.87% (on the Indonesian Twitter data), and self-knowledge augmentation improves macro-F1 by 3.29% on SemEval and 4.08% on MUStARD. The results show that context matters for sarcasm detection, especially for culture-specific slang and references, while also highlighting model-specific sensitivity to retrieval quality and prompting configuration. The work contributes a retrieval-augmented, context-aware prompting framework and provides open-source code for reproducibility and future exploration.
Abstract
Detecting sarcasm remains a challenging task in Natural Language Processing (NLP) despite recent advances in neural network approaches. Currently, Pre-trained Language Models (PLMs) and Large Language Models (LLMs) are the preferred approach for sarcasm detection. However, the complexity of sarcastic text, combined with linguistic diversity and cultural variation across communities, keeps the task difficult even for PLMs and LLMs. These models are also unreliable on words or tokens that require extra grounding to interpret. Building on Pragmatic Metacognitive Prompting (PMP), a state-of-the-art LLM prompting method for sarcasm detection, we introduce a retrieval-aware approach that incorporates retrieved contextual information for each target text. Our pipeline explores two complementary ways to provide context: adding non-parametric knowledge through web-based retrieval when the model lacks the necessary background, and eliciting the model's own internal knowledge through a self-knowledge awareness strategy. We evaluate our approach on three datasets: Twitter Indonesia Sarcastic, SemEval-2018 Task 3, and MUStARD. Non-parametric retrieval yields a significant 9.87% macro-F1 improvement on Twitter Indonesia Sarcastic compared to the original PMP method. Self-knowledge retrieval improves macro-F1 by 3.29% on SemEval and by 4.08% on MUStARD. These findings highlight the importance of context in improving LLM performance on sarcasm detection, particularly when the text involves culturally specific slang, references, or terms unknown to the LLM. Future work will focus on optimizing the retrieval of relevant contextual information and examining how retrieval quality affects performance. The experiment code is available at: https://github.com/wllchrst/sarcasm-detection_pmp_knowledge-base.
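
As a rough illustration of the two context strategies described in the abstract, the following Python sketch picks a context source for each target text and prepends the result to a prompt. It is not the authors' implementation: the function names (`gather_context`, `build_prompt`), the rule that decides between sources, the prompt wording, and the toy stand-ins for the model and the web retriever are all assumptions made for illustration. In particular, the prompt is a placeholder and does not reproduce the PMP template.

```python
from typing import Callable, Tuple


def gather_context(
    text: str,
    model_knows: Callable[[str], bool],
    elicit_self_knowledge: Callable[[str], str],
    web_retrieve: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (source, context) for one target text.

    Assumed routing rule: use the model's own knowledge when it reports
    enough background, otherwise fall back to web retrieval.
    """
    if model_knows(text):
        return "self", elicit_self_knowledge(text)
    return "web", web_retrieve(text)


def build_prompt(text: str, context: str) -> str:
    """Prepend the gathered context to a placeholder classification prompt."""
    return (
        f"Background: {context}\n"
        f"Text: {text}\n"
        "Is the text sarcastic? Answer yes or no."
    )


# Toy stand-ins so the sketch runs without a model or network access.
KNOWN_TERMS = {"monday"}


def toy_model_knows(text: str) -> bool:
    # Hypothetical check: the "model" only knows terms in KNOWN_TERMS.
    return any(term in text.lower() for term in KNOWN_TERMS)


def toy_self_knowledge(text: str) -> str:
    return "Mondays are commonly disliked as the start of the work week."


def toy_web_retrieve(text: str) -> str:
    return "(retrieved snippet explaining the unfamiliar term)"


source, context = gather_context(
    "Oh great, another Monday.",
    toy_model_knows,
    toy_self_knowledge,
    toy_web_retrieve,
)
prompt = build_prompt("Oh great, another Monday.", context)
print(source)
print(prompt)
```

In a real setup, the three callables would wrap an LLM call and a search API. Passing them in as parameters keeps the routing logic testable without either dependency.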
