Unveiling Key Aspects of Fine-Tuning in Sentence Embeddings: A Representation Rank Analysis

Euna Jung, Jaeill Kim, Jungmin Ko, Jinwoo Park, Wonjong Rhee

TL;DR

This work adopts representation rank as a lens for analyzing contrastive-learning–based fine-tuning of sentence embeddings and splits training into two phases at the point where rank peaks. It reveals strong phase-dependent relationships between rank, alignment, uniformity, linguistic abilities, and STS performance, and proposes Rank Reduction (RR) to actively regularize rank. Across five state-of-the-art CL-based models, RR improves STS performance and stabilizes training, often speeding up convergence and reducing variance across seeds. The findings offer a practical, low-cost knob for improving unsupervised sentence embeddings and invite theoretical work to explain the rank–performance link.

Abstract

The latest advancements in unsupervised learning of sentence embeddings predominantly involve employing contrastive learning-based (CL-based) fine-tuning over pre-trained language models. In this study, we analyze the latest sentence embedding methods by adopting representation rank as the primary tool of analysis. We first define Phase 1 and Phase 2 of fine-tuning based on when representation rank peaks. Utilizing these phases, we conduct a thorough analysis and obtain essential findings across key aspects, including alignment and uniformity, linguistic abilities, and correlation between performance and rank. For instance, we find that the dynamics of the key aspects can undergo significant changes as fine-tuning transitions from Phase 1 to Phase 2. Based on these findings, we experiment with a rank reduction (RR) strategy that facilitates rapid and stable fine-tuning of the latest CL-based methods. Through empirical investigations, we showcase the efficacy of RR in enhancing the performance and stability of five state-of-the-art sentence embedding methods.
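To make the notion of representation rank and the two-phase split concrete, here is a minimal sketch. The abstract does not say which rank estimator the paper uses, so the entropy-based effective rank below is an assumption, and the names `effective_rank` and `peak_step` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def effective_rank(embeddings: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy-based effective rank of an (n_sentences, dim) embedding matrix.

    Assumption: this estimator stands in for "representation rank";
    the paper's exact definition may differ.
    """
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    # Normalize singular values into a probability distribution.
    p = singular_values / (singular_values.sum() + eps)
    p = p[p > eps]  # drop (numerically) zero entries to avoid log(0)
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))


def peak_step(rank_history) -> int:
    """Index at which the recorded rank is highest.

    Steps up to this index would be Phase 1, later steps Phase 2.
    """
    return int(np.argmax(rank_history))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sentence_embeddings = rng.normal(size=(256, 768))  # stand-in for model outputs
    print("effective rank:", effective_rank(sentence_embeddings))
    print("peak at step index:", peak_step([10.0, 42.0, 55.0, 48.0, 30.0]))
```

Under this sketch, logging `effective_rank` on a fixed batch at regular intervals during fine-tuning and calling `peak_step` on the resulting history would give the boundary between the two phases.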

Paper Structure

This paper contains 26 sections, 4 equations, 7 figures, and 8 tables.

Figures (7)

  • Figure 1: Representation rank and Semantic Textual Similarity (STS) performance of various CL-based models. Blue dots depict the original models, and red dots represent the same models fine-tuned with the Rank Reduction (RR) regularizer.
  • Figure 2: Training dynamics for fine-tuning BERT with contrastive learning. Phase 1 and Phase 2 represent two distinct stages of fine-tuning, delineated by the peak in representation rank.
  • Figure 3: Training dynamics of alignment loss and uniformity loss. Alignment loss starts low, near the bottom of the y-axis, while uniformity loss starts high, near the top. Both change sharply at the outset, so the two curves overlap substantially.
  • Figure 4: Training dynamics of ten linguistic abilities. We categorize the ten tasks into three groups based on the trends in probing performance.
  • Figure 5: Scatter plot of rank and STS performance in the two phases. We experimented with SimCSE-BERT$_\texttt{base}$ and recorded validation performance and rank every 5 steps during training. The Pearson correlation is $0.85$ in Phase 1 and $-0.81$ in Phase 2.
  • ...and 2 more figures
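
The alignment and uniformity losses tracked in Figure 3 can be sketched as follows. This assumes the commonly used definitions on L2-normalized embeddings (mean squared distance between positive pairs, and the log of the mean Gaussian potential over all pairs with temperature `t = 2`); the excerpt does not state which form the paper uses, and the function names are hypothetical.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def alignment_loss(x: np.ndarray, y: np.ndarray) -> float:
    """Mean squared distance between positive pairs (row i of x pairs with row i of y)."""
    x, y = l2_normalize(x), l2_normalize(y)
    return float(np.mean(np.sum((x - y) ** 2, axis=1)))


def uniformity_loss(x: np.ndarray, t: float = 2.0) -> float:
    """Log of the mean Gaussian potential over all distinct pairs of rows.

    Assumption: t = 2 is a conventional default, not a value taken from the paper.
    """
    x = l2_normalize(x)
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    upper = np.triu_indices(len(x), k=1)  # each unordered pair once
    return float(np.log(np.mean(np.exp(-t * sq_dists[upper]))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    anchors = rng.normal(size=(64, 32))
    positives = anchors + 0.1 * rng.normal(size=(64, 32))  # lightly perturbed views
    print("alignment:", alignment_loss(anchors, positives))
    print("uniformity:", uniformity_loss(anchors))
```

Lower is better for both quantities: alignment falls as positive pairs move closer together, and uniformity falls as embeddings spread more evenly over the unit sphere.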