SoftMCL: Soft Momentum Contrastive Learning for Fine-grained Sentiment-aware Pre-training

Jin Wang, Liang-Chih Yu, Xuejie Zhang

TL;DR

SoftMCL tackles the challenge of encoding fine-grained sentiment information into pre-trained language models by replacing hard sentiment labels with continuous valence ratings as supervision. It integrates a momentum queue to enlarge the pool of negatives and performs both word- and sentence-level contrastive learning, guided by valence-aware similarities. The approach yields improvements across multiple sentiment-related tasks over strong baselines, supported by thorough ablation and parameter analyses. This method offers a practical path to richer affective representations with scalable training, and points toward extending sentiment-awareness into generative decoding models.

Abstract

Pre-training gives language models general language understanding, but it fails to distinguish the affective impact of a particular context on a specific word. Recent work has introduced contrastive learning (CL) into sentiment-aware pre-training to acquire affective information. Nevertheless, these methods have two significant limitations. First, limited GPU memory often restricts the number of negative samples, which hinders the learning of good representations. Second, supervising CL with only a few sentiment polarities as hard labels (e.g., positive, neutral, and negative) forces all representations to converge to a few points, leading to latent space collapse. This study proposes soft momentum contrastive learning (SoftMCL) for fine-grained sentiment-aware pre-training. Instead of hard labels, we introduce valence ratings as soft-label supervision for CL, so that sentiment similarities between samples are measured at a fine granularity. SoftMCL is applied at both the word and sentence levels to enhance the model's ability to learn affective information. A momentum queue is introduced to expand the set of contrastive samples, storing and involving more negatives to overcome hardware limitations. Extensive experiments on four sentiment-related tasks demonstrate the effectiveness of the proposed SoftMCL method. The code and data for SoftMCL are available at: https://www.github.com/wangjin0818/SoftMCL/.
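
The momentum queue mentioned in the abstract can be illustrated with a minimal sketch in the style of momentum contrast. It is not the authors' implementation: the class name `MomentumQueue`, the function `momentum_update`, the choice to store a valence rating next to each embedding, and the coefficient values are all assumptions made for illustration. The idea shown is that embeddings from earlier batches are kept in a fixed-size first-in, first-out buffer, so the number of available negatives is set by the queue size rather than by the batch that fits in GPU memory, while the encoder producing the queued embeddings is updated slowly as a moving average.

```python
from collections import deque


def momentum_update(key_params, query_params, m=0.999):
    """Moving-average update: key <- m * key + (1 - m) * query.

    The value m=0.999 is an illustrative default, not a value from the paper.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


class MomentumQueue:
    """Hypothetical fixed-size FIFO of (embedding, valence) pairs."""

    def __init__(self, max_size):
        # deque(maxlen=...) silently drops the oldest entry when full
        self.items = deque(maxlen=max_size)

    def enqueue(self, embeddings, valences):
        # Assumption: each stored embedding keeps its valence rating,
        # so soft similarities can be computed against queued samples.
        for emb, val in zip(embeddings, valences):
            self.items.append((list(emb), float(val)))

    def contents(self):
        embs = [e for e, _ in self.items]
        vals = [v for _, v in self.items]
        return embs, vals

    def __len__(self):
        return len(self.items)


# Usage: two small "batches" pushed into a queue that holds four samples.
queue = MomentumQueue(max_size=4)
queue.enqueue([[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]], [0.9, 0.2, 0.5])
queue.enqueue([[0.3, 0.7], [0.6, 0.4]], [0.7, 0.4])
queued_embeddings, queued_valences = queue.contents()

# One momentum step on a toy two-parameter "encoder".
key_params = momentum_update([1.0, 0.0], [0.0, 1.0], m=0.9)
```

After the second call to `enqueue`, the oldest sample has been evicted and the queue holds the four most recent samples. A larger `max_size` gives more negatives per anchor at the cost of comparing against older, slightly stale embeddings, which is why the key parameters are updated slowly.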

Paper Structure

This paper contains 22 sections, 6 equations, 7 figures, and 4 tables.

Figures (7)

  • Figure 1: Conceptual diagram of different contrastive learning strategies for sentiment-aware pre-training. (a) Self-supervised CL contrasts a single positive for each anchor (i.e., an augmentation of the anchor) against a set of negatives consisting of the entire remainder of the batch. (b) Supervised CL contrasts the set of samples with the same polarity as positives against the negatives from the remainder of the batch. (c) The proposed SoftMCL introduces external affective supervision to contrast the set of all samples according to the fine-grained distance between their valence ratings.
  • Figure 2: The overall architecture of the proposed SoftMCL.
  • Figure 3: The effect of different balance coefficients.
  • Figure 4: The effect of different temperatures.
  • Figure 5: The effect of different momentum coefficients.
  • ...and 2 more figures
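
The three strategies contrasted in the Figure 1 caption differ mainly in how target weights are assigned to the candidates for each anchor. The sketch below is an illustration under stated assumptions, not the paper's formulation: the function names, the softmax over negative absolute valence differences, and the temperature of 0.1 are all hypothetical choices, since the paper's equations are not reproduced here.

```python
import math


def softmax(scores, temperature=1.0):
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def self_supervised_targets(n_candidates, augmented_index):
    # (a) exactly one positive: the anchor's own augmentation
    return [1.0 if i == augmented_index else 0.0 for i in range(n_candidates)]


def supervised_targets(anchor_label, candidate_labels):
    # (b) every candidate with the same polarity is an equally weighted positive
    matches = [1.0 if lab == anchor_label else 0.0 for lab in candidate_labels]
    total = sum(matches)
    return [m / total for m in matches] if total else matches


def soft_targets(anchor_valence, candidate_valences, temperature=0.1):
    # (c) assumed weighting: the closer the valence rating, the larger the weight
    distances = [abs(anchor_valence - v) for v in candidate_valences]
    return softmax([-d for d in distances], temperature)


def soft_contrastive_loss(similarities, targets, temperature=0.1):
    # Cross-entropy between the target weights and the softmax over
    # anchor-candidate similarities (an InfoNCE-style objective).
    probs = softmax(similarities, temperature)
    return -sum(t * math.log(p) for t, p in zip(targets, probs) if t > 0.0)


# Usage: one anchor with valence 0.9 and three candidates.
candidate_valences = [0.85, 0.60, 0.15]
candidate_labels = ["positive", "positive", "negative"]
similarities = [0.7, 0.4, -0.2]  # made-up cosine similarities to the anchor

single = self_supervised_targets(3, augmented_index=0)
hard = supervised_targets("positive", candidate_labels)
soft = soft_targets(0.9, candidate_valences)
loss = soft_contrastive_loss(similarities, soft)
```

With hard labels, the two positive candidates receive identical weight even though their valence ratings differ, whereas the soft targets rank all three candidates by their valence distance to the anchor.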