reCSE: Portable Reshaping Features for Sentence Embedding in Self-supervised Contrastive Learning

Fufangchen Zhao, Jian Gao, Danfeng Yan

TL;DR

reCSE introduces feature reshaping to improve self-supervised sentence embeddings without adding supplementary samples, addressing the representation polarity and growing GPU memory consumption seen in augmentation-heavy methods. By decoupling the reshaping step into a separate module, it stays memory-efficient while delivering competitive semantic similarity performance. Empirical results on STS benchmarks show that reCSE closely matches or surpasses baselines and that its gains transfer to other frameworks, demonstrating portability. This work suggests that feature-level reshaping can provide a universal, memory-friendly alternative for high-quality sentence representations in contrastive learning.

Abstract

We propose reCSE, a self-supervised contrastive learning framework for sentence representation based on feature reshaping. Unlike current advanced models that rely on discrete data augmentation, reCSE reshapes the input features of the original sentence so that they aggregate the global information of each token in the sentence. This alleviates two common problems in current advanced models: representation polarity and GPU memory consumption that increases linearly. In addition, reCSE achieves competitive performance on semantic similarity tasks. Experiments also show that the proposed feature reshaping method is highly general: it can be transplanted to other self-supervised contrastive learning frameworks and enhances their representation ability, even achieving state-of-the-art performance. Our code is available at https://github.com/heavenhellchen/reCSE.
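
For readers unfamiliar with the setting, the following is a minimal sketch of the in-batch contrastive (InfoNCE) objective used by SimCSE-style frameworks, which is the family reCSE belongs to. It is not taken from the reCSE code: the function name `info_nce_loss`, the temperature of 0.05, and the use of NumPy are assumptions made for illustration, and the reshaping module itself is not shown because this page does not describe it in detail.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.05):
    """In-batch contrastive loss for two views of the same sentences.

    z1, z2: arrays of shape (batch, dim); row i of z1 and row i of z2
    are embeddings of the same sentence (the positive pair).
    temperature: assumed value for illustration, not taken from reCSE.
    """
    # L2-normalize so that dot products equal cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    # sim[i, j] = cos(z1[i], z2[j]) / temperature
    sim = (z1 @ z2.T) / temperature
    # Numerically stable log-softmax over each row.
    sim = sim - sim.max(axis=1, keepdims=True)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # The positive for sentence i sits on the diagonal; other rows are negatives.
    return float(-np.mean(np.diag(log_prob)))
```

In this kind of objective, each extra type of augmented sample adds another set of embeddings to the batch, which is consistent with the linear growth in GPU memory that the abstract attributes to augmentation-based methods.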

Paper Structure (18 sections, 13 equations, 5 figures, 2 tables)

This paper contains 18 sections, 13 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: The distribution of representation polarity test results. The distribution of the framework (b, c) based on discrete data augmentation shows polarity (concavity), and the distribution of the basic SimCSE and our reCSE (a, d) is relatively uniform.
  • Figure 2: The impact of discrete data augmentation on GPU memory consumption. The y-axis scale is measured in RTX3090 (24GB) units. As more types of additional samples are introduced, the GPU memory consumption for training also increases linearly.
  • Figure 3: The main framework of reCSE. We adopt a modular design to reduce GPU memory consumption.
  • Figure 4: The original input features are based solely on a single token (a), while the reshaped features contain the global information of each token in the sentence (b).
  • Figure 5: Test results of reCSE on GPU memory consumption.