Continuous Adversarial Text Representation Learning for Affective Recognition
Seungah Son, Andrez Saurez, Dongsoo Har
TL;DR
This work tackles learning nuanced affective representations in transformer models by introducing CARL, a framework that jointly leverages momentum continuous-label contrastive learning and gradient-based perturbed token detection guided by a two-dimensional valence-arousal affect space. By aligning sentence-level embeddings with continuous affect labels and enforcing token-level emotional sensitivity via dynamic perturbations, CARL achieves improved valence/arousal prediction, polarity classification, and emotion recognition while enhancing embedding alignment and uniformity. The method demonstrates robust gains across multiple datasets (EmoBank, IEMOCAP, FacebookPosts, EmoTales) and with two backbones (BERT-base and RoBERTa-base), including up to 15.5% accuracy improvements on emotion classification. The results highlight the importance of continuous affect labels and gradient-guided token perturbations for fine-grained affective understanding in NLP, with practical implications for affective computing and human-computer interaction.
Abstract
While pre-trained language models excel at semantic understanding, they often struggle to capture the nuanced affective information that affective recognition tasks depend on. To address this limitation, we propose a novel framework for enhancing emotion-aware embeddings in transformer-based models. Our approach uses continuous valence-arousal labels to guide contrastive learning, which captures subtle, multi-dimensional emotional nuances more effectively. Furthermore, we employ a dynamic token perturbation mechanism that uses gradient-based saliency to focus on sentiment-relevant tokens, improving model sensitivity to emotional cues. Experimental results show that the proposed framework outperforms existing methods, with up to 15.5% improvement on the emotion classification benchmark, highlighting the importance of continuous labels. These gains indicate that the framework is effective for affective representation learning and supports precise, contextually relevant emotional understanding.
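To make the idea of contrastive learning guided by continuous labels concrete, the following is a minimal sketch, not the authors' loss. It assumes one plausible formulation: for each sentence, the similarity distribution over the other sentences in a batch is pulled toward a soft target derived from distances in the valence-arousal plane. The function names, the temperature `tau`, and the label bandwidth `sigma` are all hypothetical, and the momentum encoder and token perturbation components are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_contrastive_loss(embeddings, va_labels, tau=0.1, sigma=0.5):
    """Cross-entropy between label-derived and embedding-derived neighbor
    distributions, averaged over the batch.

    embeddings: list of sentence vectors (assumed non-zero).
    va_labels:  list of (valence, arousal) pairs, one per sentence.
    tau, sigma: assumed hyperparameters, not taken from the paper.
    """
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Predicted distribution: which other sentences look similar to i.
        logits = [cosine(embeddings[i], embeddings[j]) / tau for j in others]
        pred = softmax(logits)
        # Target distribution: closer valence-arousal labels get more mass.
        dists = [math.dist(va_labels[i], va_labels[j]) for j in others]
        target = softmax([-d / sigma for d in dists])
        total += -sum(t * math.log(p) for t, p in zip(target, pred))
    return total / n


# Toy batch: two high-valence/high-arousal and two low/low sentences.
labels = [(0.9, 0.8), (0.8, 0.9), (-0.9, -0.8), (-0.8, -0.9)]
aligned = [[1.0, 0.1], [0.9, 0.2], [-1.0, -0.1], [-0.9, -0.2]]
misaligned = [[1.0, 0.1], [-1.0, -0.1], [0.9, 0.2], [-0.9, -0.2]]

loss_aligned = soft_label_contrastive_loss(aligned, labels)
loss_misaligned = soft_label_contrastive_loss(misaligned, labels)
print(loss_aligned < loss_misaligned)
```

In this toy batch, embeddings whose geometry matches the valence-arousal layout receive a lower loss than embeddings that place affectively distant sentences close together. Unlike a contrastive loss with hard positive and negative pairs, every pair contributes in proportion to its label proximity, which is the intuition behind using continuous labels.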
