jina-embeddings-v3: Multilingual Embeddings With Task LoRA

Saba Sturua, Isabelle Mohr, Mohammad Kalim Akram, Michael Günther, Bo Wang, Markus Krimmel, Feng Wang, Georgios Mastrapas, Andreas Koukounas, Nan Wang, Han Xiao

TL;DR

Jina-embeddings-v3 delivers multilingual, long-context embeddings from a compact 570M-parameter backbone enhanced by task-specific LoRA adapters and Matryoshka Representation Learning. The approach combines RoPE-based long-context encoding, a frozen XLM-RoBERTa foundation, and dedicated adapters for retrieval, clustering, classification, and text matching, producing 1024-dimensional embeddings that can be truncated to smaller sizes. Evaluations on MTEB show strong monolingual English results and competitive multilingual performance, surpassing several proprietary embedding models while offering substantial cost advantages over large LLM-based alternatives. The work also analyzes retrieval failures, reports improvements from synthetic data augmentation, and includes ablation studies on embedding dimension and retrieval asymmetry, highlighting practical potential for production and edge deployment.

Abstract

We introduce jina-embeddings-v3, a novel text embedding model with 570 million parameters that achieves state-of-the-art performance on multilingual data and long-context retrieval tasks, supporting context lengths of up to 8192 tokens. The model includes a set of task-specific Low-Rank Adaptation (LoRA) adapters to generate high-quality embeddings for query-document retrieval, clustering, classification, and text matching. Evaluation on the MTEB benchmark shows that jina-embeddings-v3 outperforms the latest proprietary embeddings from OpenAI and Cohere on English tasks, while achieving superior performance compared to multilingual-e5-large-instruct across all multilingual tasks. With a default output dimension of 1024, users can flexibly reduce the embedding dimensions to as low as 32 without compromising performance, enabled by Matryoshka Representation Learning.
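
To make the dimension reduction concrete, here is a minimal sketch of how a Matryoshka-style embedding is typically shortened: keep the leading components and re-normalize. It is not taken from the paper's code. The function names `truncate_embedding` and `cosine` are hypothetical, the 8-dimensional vectors are toy stand-ins for real 1024-dimensional outputs, and the L2 re-normalization step is an assumption that suits cosine-similarity search.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and L2-renormalize them.

    Hypothetical helper: assumes the leading components carry the most
    information, which is what Matryoshka training is meant to encourage.
    """
    if not 1 <= dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to normalize
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional vectors standing in for 1024-dimensional embeddings.
full_a = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
full_b = [0.45, 0.42, 0.25, 0.22, -0.1, 0.04, -0.03, 0.02]

# Shorten both vectors to 4 dimensions before comparing them.
short_a = truncate_embedding(full_a, 4)
short_b = truncate_embedding(full_b, 4)

print("full similarity:     ", cosine(full_a, full_b))
print("truncated similarity:", cosine(short_a, short_b))
```

In practice the same truncation would be applied to both stored documents and incoming queries, so that all vectors in an index share one dimension.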

Paper Structure

This paper contains 25 sections, 5 equations, 2 figures, 17 tables.

Figures (2)

  • Figure 1: The architecture of jina-embeddings-v3 is based on the XLM-RoBERTa model, with several key modifications. FlashAttention 2 is integrated for enhanced computational efficiency, while RoPE extends support for sequences up to 8192 tokens. Task-specific LoRA adapters are introduced to optimize embeddings for various tasks. The model’s input consists of two parts: the text, which is the long document to be embedded, and the task type. jina-embeddings-v3 supports four tasks and implements five adapters to choose from: retrieval.query and retrieval.passage for query and passage embeddings in asymmetric retrieval tasks, separation for clustering and reranking tasks, classification for classification tasks, and text-matching for tasks involving semantic similarity, such as STS or symmetric retrieval.
  • Figure 2: Scaling law of embedding models. Each dot represents an embedding model. jina-embeddings-v3 demonstrates superior performance compared to models of similar size, showing a superlinear improvement over its predecessor, jina-embeddings-v2. This graph was created by selecting 100 embedding models from the MTEB leaderboard, excluding those without size information, typically closed-source or proprietary models. Submissions identified as outliers or trolling were also filtered out. The average MTEB performance on English tasks is plotted against the number of model parameters. The trendline, representing all models, is highlighted, with multilingual models emphasized in orange.
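
The two-part input described in the Figure 1 caption (a text plus a task type that selects an adapter) can be illustrated with a generic LoRA layer. The sketch below is an assumption-laden toy, not the model's implementation: the class `LoRALinear`, the 2x2 weight, the rank of 1, and the scaling `alpha / rank` follow the standard LoRA formulation, and none of the numeric values come from the paper. Only the five adapter names are taken from the caption.

```python
# Adapter names as listed in the Figure 1 caption.
TASKS = (
    "retrieval.query",
    "retrieval.passage",
    "separation",
    "classification",
    "text-matching",
)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Toy linear layer with one low-rank adapter per task (hypothetical).

    Computes W x + (alpha / rank) * B (A x), where W is shared and frozen
    and the pair (A, B) is chosen by the task name.
    """

    def __init__(self, weight, adapters, alpha, rank):
        self.weight = weight      # frozen base weight, shape d_out x d_in
        self.adapters = adapters  # task -> (A: rank x d_in, B: d_out x rank)
        self.scale = alpha / rank

    def forward(self, x, task):
        if task not in self.adapters:
            raise KeyError(f"no adapter registered for task {task!r}")
        a, b = self.adapters[task]
        base = matvec(self.weight, x)
        delta = matvec(b, matvec(a, x))  # low-rank update B (A x)
        return [y + self.scale * d for y, d in zip(base, delta)]


# Illustrative values only: identity base weight and rank-1 adapters.
layer = LoRALinear(
    weight=[[1.0, 0.0], [0.0, 1.0]],
    adapters={
        "retrieval.query": ([[1.0, 0.0]], [[0.5], [0.0]]),
        "retrieval.passage": ([[0.0, 1.0]], [[0.0], [0.5]]),
    },
    alpha=1.0,
    rank=1,
)

x = [2.0, 4.0]
query_out = layer.forward(x, "retrieval.query")
passage_out = layer.forward(x, "retrieval.passage")
print(query_out, passage_out)
```

The same input yields different outputs depending on the task name, which is the mechanism that lets a single shared backbone serve asymmetric retrieval (separate query and passage adapters) alongside the other tasks.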