A Transformer and Prototype-based Interpretable Model for Contextual Sarcasm Detection
Ximing Wen, Rezvaneh Rezapour
TL;DR
This work tackles sarcasm detection by integrating semantic and sentiment embeddings from transformer models with a dual prototype-based architecture that is intrinsically interpretable. The model uses semantic and sentiment prototype layers, initialized via class-specific clustering, and scores the similarity between an embedding $\mathbf{e}$ and a prototype $\mathbf{p}$ with an RBF kernel, for example $sim(\mathbf{e}, \mathbf{p}) = \exp\left(-\frac{\|\mathbf{e}-\mathbf{p}\|_2^2}{\sigma^2}\right)$. Sentence-level explanations are produced by projecting prototypes onto training samples. An incongruity loss between explicit and implicit sentiment pathways reinforces sarcasm signals, contributing to state-of-the-art performance on three public datasets. Empirical results, qualitative case studies, and an ablation analysis indicate that the approach delivers human-readable explanations while maintaining high predictive accuracy, which highlights its practical value for transparent affective analysis.
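As a rough illustration of the prototype similarity quoted above, the following NumPy sketch computes the radial basis function (RBF) score between a batch of sentence embeddings and a set of prototypes. The function name `rbf_similarity`, the array shapes, and the default `sigma` are assumptions made for this example; the summary does not say how the authors implement or tune the kernel.

```python
import numpy as np

def rbf_similarity(embeddings, prototypes, sigma=1.0):
    """Return sim(e, p) = exp(-||e - p||_2^2 / sigma^2) for every pair.

    embeddings: array-like of shape (n, d)
    prototypes: array-like of shape (k, d)
    sigma:      kernel width (hypothetical default, not from the paper)
    Result has shape (n, k).
    """
    e = np.asarray(embeddings, dtype=float)
    p = np.asarray(prototypes, dtype=float)
    # Broadcast to (n, k, d) and take the squared Euclidean distance.
    diff = e[:, None, :] - p[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / sigma ** 2)

# Toy usage: one embedding compared against two prototypes.
scores = rbf_similarity([[0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]], sigma=1.0)
```

A score of 1 means the embedding coincides with the prototype, and the score decays toward 0 as the squared distance grows relative to $\sigma^2$.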
Abstract
Sarcasm, with its figurative nature, poses unique challenges for affective systems designed to perform sentiment analysis. While these systems typically perform well at identifying direct expressions of emotion, they struggle with sarcasm's inherent contradiction between literal and intended sentiment. Since transformer-based language models (LMs) are known for their ability to capture contextual meaning, we propose a method that combines LMs with prototype-based networks, enhanced by sentiment embeddings, to conduct interpretable sarcasm detection. Our approach is intrinsically interpretable and requires no additional post-hoc interpretability techniques. We test our model on three public benchmark datasets and show that it outperforms the current state-of-the-art. At the same time, the prototypical layer enhances the model's inherent interpretability by generating explanations through similar examples at inference time. Furthermore, an ablation study demonstrates the effectiveness of the incongruity loss, which we construct using sentiment prototypes.
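
The idea of explaining a prediction through similar training examples can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes prototypes are projected onto their nearest training embedding under squared Euclidean distance, and the names `project_prototypes`, `explain`, and the toy sentences are hypothetical.

```python
import numpy as np

def project_prototypes(prototypes, train_embeddings):
    """Snap each prototype to its nearest training embedding.

    Returns the projected prototypes and the index of the training
    sample each prototype was mapped to.
    """
    p = np.asarray(prototypes, dtype=float)
    x = np.asarray(train_embeddings, dtype=float)
    # Pairwise squared distances, shape (num_prototypes, num_train).
    sq_dist = np.sum((p[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    nearest = np.argmin(sq_dist, axis=1)
    return x[nearest], nearest

def explain(embedding, projected, nearest, train_sentences):
    """Return the training sentence behind the closest prototype."""
    e = np.asarray(embedding, dtype=float)
    sq_dist = np.sum((projected - e) ** 2, axis=1)
    best = int(np.argmin(sq_dist))
    return train_sentences[int(nearest[best])]

# Toy data (made up for illustration only).
train_sentences = ["Great, another Monday.", "I love this song.", "The bus is late."]
train_embeddings = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
prototypes = [[0.1, 0.0], [0.9, 1.0]]

projected, nearest = project_prototypes(prototypes, train_embeddings)
example = explain([1.0, 1.1], projected, nearest, train_sentences)
```

Because every prototype coincides with a real training sentence after projection, the sentence returned by `explain` can be shown to the user directly as the evidence for a prediction.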
