OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning
Jiawei Zhou, Lei Chen
TL;DR
OpenRAG tackles the misalignment between IR-derived retriever relevance and RAG performance by learning in-context relevance end-to-end. It combines offline RAG warmup with online in-training retrieval using a semi-parametric disentangled retriever (SiDR) and a contrastive objective to align the retriever with downstream evaluation. Across four benchmarks, OpenRAG yields a 4.0% improvement over the original retriever and 2.1% over state-of-the-art retrievers, with notable gains on PubHealth and potential to surpass some 8B LLM-based approaches in cost-sensitive settings. The results demonstrate that retrieval learning is a potent lever for enhancing RAG systems and can transfer to other LLMs for open-ended generation, albeit with some limitations for closed-set tasks.
Abstract
In this paper, we analyze and empirically show that the relevance learned in conventional information retrieval (IR) scenarios may be inconsistent with the relevance required in retrieval-augmented generation (RAG) scenarios. To bridge this gap, we introduce OpenRAG, a RAG framework that is optimized end-to-end by tuning the retriever to capture in-context relevance, enabling adaptation to diverse and evolving needs. Extensive experiments across a wide range of tasks demonstrate that OpenRAG, by tuning a retriever end-to-end, yields a consistent improvement of 4.0% over the original retriever and outperforms existing state-of-the-art retrievers by 2.1%. Additionally, our results indicate that for some tasks, an end-to-end tuned 0.2B retriever can achieve improvements that surpass those of RAG-oriented or instruction-tuned 8B large language models (LLMs), highlighting the cost-effectiveness of our approach in enhancing RAG systems.
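The TL;DR mentions a contrastive objective that aligns the retriever with downstream evaluation. As an illustration only, the sketch below shows one common form such an objective can take: an InfoNCE-style loss over the documents retrieved for a single query. The function name `contrastive_loss`, the `is_positive` labels (assumed here to mark documents that led the LLM to a correct output), and the exact loss form are assumptions made for this example, not the paper's confirmed formulation.

```python
import math

def contrastive_loss(scores, is_positive):
    """InfoNCE-style loss for one query (illustrative, hypothetical form).

    scores      -- retriever relevance scores, one per retrieved document
    is_positive -- booleans; True where the document is assumed to have
                   helped the downstream LLM (an assumption of this sketch)
    """
    # Without both positives and negatives there is nothing to contrast.
    if not any(is_positive) or all(is_positive):
        return 0.0
    # Subtract the max score for numerical stability before exponentiating.
    m = max(scores)
    exp_scores = [math.exp(s - m) for s in scores]
    pos_mass = sum(e for e, p in zip(exp_scores, is_positive) if p)
    total_mass = sum(exp_scores)
    # Negative log of the probability mass assigned to positive documents.
    return -math.log(pos_mass / total_mass)

# The loss is lower when the helpful document already scores higher.
aligned = contrastive_loss([2.0, 0.0], [True, False])
misaligned = contrastive_loss([0.0, 2.0], [True, False])
```

Minimizing a loss of this kind pushes the retriever to score documents that help generation above those that do not, which is the sense in which relevance becomes "in-context" rather than inherited from IR labels.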
