NV-Retriever: Improving text embedding models with effective hard-negative mining

Gabriel de Souza P. Moreira; Radek Osmulski; Mengyao Xu; Ronay Ak; Benedikt Schifferer; Even Oldridge

TL;DR

This work tackles the challenge of selecting high-quality hard negatives for contrastive fine-tuning of text embedding models. It introduces positive-aware hard-negative mining methods, notably TopK-MarginPos and TopK-PercPos, which use the relevance score of the positive passage to filter out potential false negatives and stabilize training. Through ablations across teacher-model choices, ensembling strategies, and mining configurations, the study demonstrates gains in retrieval accuracy, culminating in the NV-Retriever-v1 model, which achieves top performance on the MTEB Retrieval (BEIR) benchmark. The results indicate that a careful mining strategy is a key driver of state-of-the-art dense retrieval, with practical implications for scalable, high-accuracy retrieval systems in RAG and semantic search.

Abstract

Text embedding models have been popular for information retrieval applications such as semantic search and question-answering systems based on Retrieval-Augmented Generation (RAG). Those models are typically Transformer models that are fine-tuned with contrastive learning objectives. One of the challenging aspects of fine-tuning embedding models is the selection of high-quality hard-negative passages for contrastive learning. In this paper we introduce a family of positive-aware mining methods that use the positive relevance score as an anchor for effective false-negative removal, leading to faster training and more accurate retrieval models. We provide an ablation study of hard-negative mining methods and their configurations, exploring different teacher and base models. We further demonstrate the efficacy of our proposed mining methods at scale with the NV-Retriever-v1 model, which scores 60.9 on the MTEB Retrieval (BEIR) benchmark and ranked 1st on MTEB Retrieval when it was published in July 2024.
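To make the idea of anchoring on the positive relevance score concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that TopK-MarginPos caps negative scores at the positive score minus an absolute margin, and that TopK-PercPos caps them at a percentage of the positive score (Figure 2 mentions a 95% threshold). The function name `mine_hard_negatives`, its parameters, the inclusive comparison, and the toy scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def mine_hard_negatives(
    pos_score: float,
    candidates: Sequence[Tuple[str, float]],
    k: int = 4,
    margin: Optional[float] = None,
    perc: Optional[float] = None,
) -> List[str]:
    """Return up to k candidate ids whose teacher score is not too close to the positive.

    candidates holds (passage_id, score) pairs scored by a teacher model.
    Set exactly one of:
      margin -> assumed TopK-MarginPos rule: keep score <= pos_score - margin
      perc   -> assumed TopK-PercPos rule:   keep score <= pos_score * perc
    """
    if (margin is None) == (perc is None):
        raise ValueError("set exactly one of margin or perc")

    # Maximum score a candidate may have and still count as a negative (assumption).
    if margin is not None:
        threshold = pos_score - margin
    else:
        threshold = pos_score * perc

    # Rank by descending score so the hardest remaining negatives come first.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)

    # Candidates scoring above the threshold are treated as likely false negatives.
    kept = [pid for pid, score in ranked if score <= threshold]
    return kept[:k]


# Toy teacher scores for one query whose labeled positive scored 0.80.
candidates = [("d1", 0.79), ("d2", 0.70), ("d3", 0.78), ("d4", 0.50), ("d5", 0.77)]

perc_negatives = mine_hard_negatives(0.80, candidates, k=4, perc=0.95)
margin_negatives = mine_hard_negatives(0.80, candidates, k=2, margin=0.025)
print(perc_negatives, margin_negatives)
```

In this toy case, a naive top-k selection would pick `d1`, `d3`, and `d5`, whose scores are nearly as high as the positive's, which is the situation where unlabeled relevant passages are most likely to be mistaken for negatives. Both positive-aware rules drop candidates above their threshold before taking the top k.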

Paper Structure (32 sections, 2 equations, 4 figures, 10 tables, 3 algorithms)

Introduction
Background
Text embedding models
Hard-negative mining for fine-tuning embedding models
False negatives
Methodology
Positive-aware hard-negative mining methods
Research Questions
Experiments setup
Training
Evaluation
Experiment results and discussion
RQ1. Using different teacher models for mining
RQ2. Ensembling hard-negatives from different teacher models
RQ3.a Comparing methods for mining hard-negatives
...and 17 more sections

Figures (4)

Figure 1: Ablation study of the negative mining methods over their configuration.
Figure 2: Ablation study of TopK-PercPos negative mining method with 95% threshold + (Sampled Top-k) or (top1 + Sampled) sampling method. Four negatives are sampled among different top $k$
Figure 3: Percentage of relevant contexts, (true) positives and (false) mined negatives, as classified by an LLM-as-a-judge (Llama 3.1 70B Instruct).
Figure 4: Histograms comparing Naive Top-k and TopK-PercPos mining methods
