Towards a Search Engine for Machines: Unified Ranking for Multiple Retrieval-Augmented Large Language Models

Alireza Salemi, Hamed Zamani

TL;DR

The paper tackles the problem of optimizing retrieval for multiple downstream RAG systems by introducing uRAG, a unified search engine that serves diverse RAG models as users. It formalizes a joint optimization framework where each model supplies a task-specific utility, enabling a cross-model reranking objective that leverages implicit feedback. Through a large-scale ecosystem of 18 RAG models and extensive experiments, the authors demonstrate that a unified reranker can match or exceed per-model retrievers, with statistically significant gains for a majority of models and meaningful generalization to unseen models and datasets. The work contributes a concrete formulation, a practical training guideline, and a scalable experimental platform, and it open-sources the code and trained parameters to accelerate research. This approach has practical impact for building scalable, machine-centered search systems that support multiple knowledge-grounding tasks across diverse LLM architectures.

Abstract

This paper introduces uRAG--a framework with a unified retrieval engine that serves multiple downstream retrieval-augmented generation (RAG) systems. Each RAG system consumes the retrieval results for a unique purpose, such as open-domain question answering, fact verification, entity linking, and relation extraction. We introduce a generic training guideline that standardizes the communication between the search engine and the downstream RAG systems that engage in optimizing the retrieval model. This lays the groundwork for us to build a large-scale experimentation ecosystem consisting of 18 RAG systems that engage in training and 18 unknown RAG systems that use the uRAG as the new users of the search engine. Using this experimentation ecosystem, we answer a number of fundamental research questions that improve our understanding of promises and challenges in developing search engines for machines.

Paper Structure

This paper contains 11 sections, 4 equations, 3 figures, and 7 tables.

Figures (3)

  • Figure 1: A high-level overview of the uRAG ecosystem. The ecosystem consists of a shared search engine that serves multiple RAG models, each performing its own task.
  • Figure 2: An overview of interactions between RAG models (also known as predictive models) and the unified search engine.
  • Figure 3: The performance of the unified retrieval model using different percentages of training data. The dashed line indicates the performance of the model trained on the full dataset. The far-right plot shows the overall average performance across all datasets.