Unsupervised Query Routing for Retrieval Augmented Generation
Feiteng Mu, Liwen Zhang, Yong Jiang, Wenjie Li, Zhen Zhang, Pengjun Xie, Fei Huang
TL;DR
The paper tackles the challenge of routing queries to the most suitable search engine in retrieval-augmented generation without relying on annotated data. It introduces an unsupervised framework that treats the multi-source retrieval response as an upper bound against which single-source responses are evaluated, enabling automatic label generation from real user queries. Labels are derived from a combination of similarity (BERTScore) and coherence (LLM-based ranking) metrics, and a ListMLE loss guides the training of the routing model. Across five datasets and multiple LLMs, the approach demonstrates strong scalability and generalization, offering a practical path to tool learning in RAG systems without manual annotation.
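To make the training objective concrete, here is a minimal sketch of a ListMLE loss in plain Python. It assumes the router emits one score per search engine and that the automatically generated labels are numeric quality scores (higher is better). The function name `listmle_loss` and the argument layout are illustrative assumptions, not the authors' implementation.

```python
import math


def listmle_loss(scores, labels):
    """Negative log-likelihood of the label-induced ranking (Plackett-Luce model).

    scores: router outputs, one per search engine (hypothetical layout).
    labels: automatically generated quality labels, higher is better.
    """
    # Order engine indices from best to worst label; ties keep input order.
    order = sorted(range(len(labels)), key=lambda i: -labels[i])
    ranked = [scores[i] for i in order]

    loss = 0.0
    for i in range(len(ranked)):
        tail = ranked[i:]
        # Numerically stable log-sum-exp over the items not yet placed.
        m = max(tail)
        log_z = m + math.log(sum(math.exp(s - m) for s in tail))
        # -log P(item i is chosen first among the remaining items)
        loss += log_z - ranked[i]
    return loss


# Scores that agree with the label ordering yield a smaller loss.
good = listmle_loss([2.0, 0.0], [1.0, 0.0])
bad = listmle_loss([0.0, 2.0], [1.0, 0.0])
```

Because the loss is defined over the full ordering of engines rather than a single best choice, it can use graded labels and does not need a hard winner for every query.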
Abstract
Query routing for retrieval-augmented generation aims to assign an input query to the most suitable search engine. Existing works rely heavily on supervised datasets that require extensive manual annotation, resulting in high costs, limited scalability, and poor generalization to out-of-distribution scenarios. To address these challenges, we introduce a novel unsupervised method that constructs an "upper-bound" response against which the quality of retrieval-augmented responses is evaluated. This evaluation determines the most suitable search engine for a given query. By eliminating manual annotations, our approach can automatically process large-scale real user queries and create training data. We conduct extensive experiments across five datasets, demonstrating that our method significantly enhances scalability and generalization capabilities.
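The labeling idea can be sketched as follows: each single-source response is scored against the upper-bound response, and the best-scoring engine becomes the routing label. This is a simplified illustration under stated assumptions. A token-overlap F1 stands in for the similarity metric, the coherence signal is omitted, and the engine names and the helpers `token_f1` and `route_label` are hypothetical.

```python
from collections import Counter


def token_f1(candidate, reference):
    """Token-overlap F1, used here only as a stand-in similarity metric."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def route_label(single_source_responses, upper_bound_response):
    """Score each engine's response against the upper bound and pick the best."""
    scores = {
        engine: token_f1(response, upper_bound_response)
        for engine, response in single_source_responses.items()
    }
    best_engine = max(scores, key=scores.get)
    return best_engine, scores


# Hypothetical engines and responses for one query.
responses = {
    "web": "paris is the capital of france",
    "news": "the weather is sunny today",
}
upper_bound = "The capital of France is Paris"
best, scores = route_label(responses, upper_bound)
```

The per-engine scores, not just the winner, are worth keeping, since a listwise ranking objective can train on the full ordering.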
