UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation

Zixuan Li; Jing Xiong; Fanghua Ye; Chuanyang Zheng; Xun Wu; Jianqiao Lu; Zhongwei Wan; Xiaodan Liang; Chengming Li; Zhenan Sun; Lingpeng Kong; Ngai Wong

TL;DR

UncertaintyRAG addresses long-context retrieval in RAG by introducing a span-level uncertainty measure, based on the Signal-to-Noise Ratio (SNR), to calibrate similarity between text chunks. It trains a robust retrieval model without supervision, using a contrastive objective whose positives and negatives are derived from span uncertainty, together with scalable data-sampling strategies across diverse datasets. Empirical results show improved performance under distribution shift and strong calibration: the method reaches state-of-the-art results using only 4% of the training data required by other advanced open-source retrieval models, and without fine-tuning the LLM. The result is a lightweight, plug-and-play retrieval component that can be integrated with various LLMs and context window lengths, making it practical for robust long-context QA and generation tasks.

Abstract

We present UncertaintyRAG, a novel approach for long-context Retrieval-Augmented Generation (RAG) that utilizes Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate similarity between text chunks. This span uncertainty enhances model calibration, improving robustness and mitigating semantic inconsistencies introduced by random chunking. Leveraging this insight, we propose an efficient unsupervised learning technique to train the retrieval model, alongside an effective data sampling and scaling strategy. UncertaintyRAG outperforms baselines by 2.03% on LLaMA-2-7B, achieving state-of-the-art results while using only 4% of the training data compared to other advanced open-source retrieval models under distribution shift settings. Our method demonstrates strong calibration through span uncertainty, leading to improved generalization and robustness in long-context RAG tasks. Additionally, UncertaintyRAG provides a lightweight retrieval model that can be integrated into any large language model with varying context window lengths, without the need for fine-tuning, showcasing the flexibility of our approach.
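
The TL;DR describes a contrastive objective built from positives and negatives. As a rough illustration only, the sketch below uses the standard InfoNCE loss for a single anchor chunk embedding. The source does not state which contrastive loss, similarity function, or temperature the authors use, so `info_nce_loss`, the cosine similarity, and `temperature=0.05` are all assumptions made for this example.

```python
import numpy as np


def info_nce_loss(anchor, positive, negatives, temperature=0.05):
    """InfoNCE loss for one anchor embedding (illustrative, not the paper's code).

    anchor:    array of shape (d,)
    positive:  array of shape (d,)
    negatives: array of shape (n, d)
    """
    def normalize(x):
        # L2-normalize along the last axis so dot products are cosine similarities.
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a = normalize(np.asarray(anchor, dtype=float))
    p = normalize(np.asarray(positive, dtype=float))
    n = normalize(np.asarray(negatives, dtype=float))

    # Similarity logits: the positive pair sits at index 0, negatives follow.
    logits = np.concatenate(([a @ p], n @ a)) / temperature

    # Numerically stable log-softmax, evaluated at the positive's index.
    logits = logits - logits.max()
    log_prob_positive = logits[0] - np.log(np.exp(logits).sum())
    return float(-log_prob_positive)
```

The loss is small when the anchor is closer to its positive than to every negative, and grows when a negative is the nearest neighbour. In the setting described above, the positive and negative chunks would be selected by span uncertainty rather than by labels.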

Paper Structure (35 sections, 8 equations, 6 figures, 7 tables)

Introduction
Related Work
Attention Mechanisms in Long Contexts
Position Encoding
Retrieval-Augmented Generation
Methodology
Span Uncertainty
Training Strategy
Construction of Positive and Negative Samples
Data Scaling Strategy
Anchor Sample Scaling Strategy
Positive and Negative Sample Scaling Strategy
Contrastive Learning
Model Inference
Experiment
...and 20 more sections

Figures (6)

Figure 1: Each line in the figure represents the trend of SNR variation for different samples, where two chunks are concatenated and input into the LLM for uncertainty estimation. The SNR is calculated as a sliding window moves across the concatenated input. Notably, the SNR values exhibit a significant drop early on, even before reaching the end of the first chunk.
Figure 2: Scaling and Training. The figure presents the details of scaling and training.
Figure 3: Representation Similarity Analysis
Figure 4: Align and Uniform. This figure shows the uniformity and alignment of different chunk embeddings, along with their averaged semantic textual similarity (STS; Conneau et al., 2017) results.
Figure 5: AUROCs of Uncertainty Measures. The horizontal axis represents the threshold $\tau$.
...and 1 more figures
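
The Figure 1 caption describes computing SNR as a sliding window moves across the concatenated input. A minimal sketch of that idea follows. The page does not give the SNR formula, so this example assumes SNR is the absolute mean divided by the standard deviation of per-token log-probabilities within each window. The function name `sliding_window_snr`, the window size, and the input format are hypothetical.

```python
import numpy as np


def sliding_window_snr(token_log_probs, window=4, eps=1e-8):
    """Windowed SNR over per-token log-probabilities.

    ASSUMPTION: SNR = |mean| / std inside each window. The source does not
    state the exact definition; this is only one plausible choice.
    """
    lp = np.asarray(token_log_probs, dtype=float)
    if lp.size < window:
        raise ValueError("need at least `window` token log-probabilities")

    # Shape (len(lp) - window + 1, window): one row per window position.
    windows = np.lib.stride_tricks.sliding_window_view(lp, window)

    mean = windows.mean(axis=1)
    std = windows.std(axis=1)
    # eps avoids division by zero when a window is constant.
    return np.abs(mean) / (std + eps)
```

Here `token_log_probs` would be the log-probabilities an LLM assigns to the tokens of two concatenated chunks. A drop in the returned values marks a region where token-level confidence becomes more variable.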
