Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning

Trapoom Ukarapol; Zhicheng Lee; Amy Xin

TL;DR

The paper improves the embedding quality of small language models by applying contrastive fine-tuning on an NLI corpus, using the InfoNCE objective with hard negatives and LoRA for efficiency. It benchmarks Gemma, Phi-2, and MiniCPM on nine STS datasets and finds that MiniCPM achieves the highest average Spearman correlation ($83.84\% \pm 4.27$) and shows substantial gains after fine-tuning (a $56.33\%$ average improvement). The approach uses EOS-based embedding extraction and shows that efficient, task-aligned fine-tuning can close the gap between small and large models on semantic similarity tasks. The code is publicly available, which supports the practical viability of resource-conscious embedding improvements in smaller LMs.
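To make the training objective concrete, here is a minimal pure-Python sketch of an InfoNCE loss with hard negatives. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice to treat every positive and every hard negative in the batch as a candidate, and the temperature value of 0.05 are all hypothetical. Vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_with_hard_negatives(anchors, positives, hard_negatives, temperature=0.05):
    """Mean InfoNCE loss over a batch (hypothetical helper, not from the paper).

    For anchor i, positives[i] is the correct match. All other positives and
    all hard negatives in the batch act as negatives. The temperature default
    is an assumed value chosen only for illustration.
    """
    candidates = positives + hard_negatives
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, c) / temperature for c in candidates]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Negative log-probability of the true positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

With well-aligned pairs the loss approaches zero, while mismatched pairs drive it up, which is the signal that pulls entailment pairs together and pushes hard negatives apart in embedding space.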

Abstract

While Large Language Models show remarkable performance in natural language understanding, their resource-intensive nature makes them less accessible. In contrast, smaller language models such as MiniCPM offer more sustainable scalability, but often underperform without specialized optimization. In this paper, we explore the enhancement of smaller language models through the improvement of their text embeddings. We select three language models, MiniCPM, Phi-2, and Gemma, to conduct contrastive fine-tuning on the NLI dataset. Our results demonstrate that this fine-tuning method enhances the quality of text embeddings for all three models across various benchmarks, with MiniCPM showing the most significant improvements of an average 56.33% performance gain. The contrastive fine-tuning code is publicly available at https://github.com/trapoom555/Language-Model-STS-CFT.

Paper Structure (26 sections, 2 equations, 2 figures, 6 tables)

Introduction
Related Works
Text Embeddings
Contrastive Representation Learning
Lightweight LLMs
LLM Fine-tuning
Methodology
Dataset
Language Model Choices
Contrastive Fine-tuning
Embedding Vector Extraction
Training Objective
Experiments
STS12, STS13, STS14, STS15, STS16, STS17, STSBenchmark
BIOSSES
...and 11 more sections

Figures (2)

Figure 1: Average performance of model checkpoints across all 9 test sets. Each value is the average Spearman correlation of cosine similarities across the 9 test sets. The model converged after checkpoint 200.
Figure 2: Fine-tuning loss over training steps.
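
The metric named in the Figure 1 caption, Spearman correlation of cosine similarities, can be sketched in pure Python as follows. This is a minimal illustration rather than the paper's evaluation code: the helper names are hypothetical, the embeddings are assumed to be non-zero vectors, and the scores are assumed not to be all identical (otherwise the correlation is undefined).

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def sts_spearman(embedding_pairs, gold_scores):
    """Spearman correlation between predicted cosine similarities and gold scores."""
    predicted = [cosine(u, v) for u, v in embedding_pairs]
    return pearson(average_ranks(predicted), average_ranks(gold_scores))
```

Because Spearman correlation depends only on rank order, a model scores well when it orders sentence pairs by similarity the same way the human annotations do, regardless of the absolute cosine values.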
