GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings
Raghuveer Thirukovalluru, Bhuwan Dhingra
TL;DR
GenEOL produces training-free sentence embeddings by using an LLM to generate $m$ meaning-preserving transformations of each sentence and averaging their embeddings with that of the original. It pairs an instruction-tuned generator with a pretrained embedder that uses a fixed EOL prompt. The resulting representations surpass prior training-free methods on STS and MTEB benchmarks and keep their quality more stable across LLM layers. Key insights include the importance of diverse transformations, the benefit of compositional summaries, and the resilience of GenEOL to prompt perturbations, with notable gains even at small $m$. The approach trades inference-time compute for embedding quality and can operate with black-box LLMs, at the cost of higher computation and the risk of hallucinated content in the generated variants.
Abstract
Training-free embedding methods directly leverage pretrained large language models (LLMs) to embed text, bypassing the costly and complex procedure of contrastive learning. Previous training-free embedding methods have mainly focused on optimizing embedding prompts and have overlooked the generative abilities of LLMs. We propose a novel method, GenEOL, which uses LLMs to generate diverse transformations of a sentence that preserve its meaning, and aggregates the embeddings of these transformations to enhance the overall sentence embedding. GenEOL outperforms existing training-free embedding methods by an average of 2.85 points across several LLMs on the semantic textual similarity (STS) benchmark. GenEOL also achieves notable gains in clustering, reranking, and pair-classification tasks from the MTEB benchmark. Additionally, GenEOL stabilizes representation quality across LLM layers and remains robust to perturbations of embedding prompts.
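The aggregation step described above can be illustrated with a minimal Python sketch. It only shows the averaging of the original sentence's embedding with those of $m$ generated variants; it is not the authors' implementation. The names `geneol_style_embed`, `mean_pool`, `toy_generate`, and `toy_embed` are hypothetical, and the toy generator and embedder are trivial stand-ins for an instruction-tuned LLM and an EOL-prompted embedder.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def geneol_style_embed(
    sentence: str,
    generate: Callable[[str, int], str],  # hypothetical: returns the i-th variant
    embed: Callable[[str], Vector],       # hypothetical: returns one embedding
    m: int,
) -> Vector:
    """Average the embedding of `sentence` with those of m generated variants."""
    variants = [generate(sentence, i) for i in range(m)]
    return mean_pool([embed(sentence)] + [embed(v) for v in variants])


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_generate(sentence: str, i: int) -> str:
    templates = ["In other words, {}", "Put simply, {}", "{} (restated)"]
    return templates[i % len(templates)].format(sentence)


def toy_embed(text: str) -> Vector:
    # Two crude features: character count and space count.
    return [float(len(text)), float(text.count(" "))]


if __name__ == "__main__":
    print(geneol_style_embed("The cat sat on the mat.", toy_generate, toy_embed, m=2))
```

In a real setup, `generate` would call a generator LLM with a transformation prompt and `embed` would read a hidden state from the embedder; with `m=0` the sketch reduces to embedding the original sentence alone.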
