Query Routing for Homogeneous Tools: An Instantiation in the RAG Scenario

Feiteng Mu; Yong Jiang; Liwen Zhang; Chu Liu; Wenjie Li; Pengjun Xie; Fei Huang

TL;DR

This paper addresses the selection of homogeneous tools by predicting both the performance of each tool and the cost it incurs on a given task, then assigning each query to the most cost-effective tool.

Abstract

Current research on tool learning primarily focuses on selecting the most effective tool from a wide array of options, often overlooking cost-effectiveness, a crucial factor in human problem-solving. In this paper, we address the selection of homogeneous tools by predicting both their performance and the associated cost required to accomplish a given task. We then assign queries to the optimal tools in a cost-effective manner. Our experimental results demonstrate that our method achieves higher performance at a lower cost compared to strong baseline approaches.
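
The predict-then-assign idea described above can be illustrated with a minimal sketch. It is not the paper's assignment strategy: it assumes a predictor has already produced a score and a cost for every (query, tool) pair, and it swaps in a simple greedy rule that spends a fixed budget on the upgrades with the best predicted gain per unit of cost. The names `ToolEstimate` and `route_queries`, the tool labels, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolEstimate:
    tool: str     # hypothetical tool label, e.g. "llm_only", "bing", "google"
    score: float  # predicted answer quality if the LLM uses this tool
    cost: float   # predicted cost of using this tool for the query


def route_queries(estimates, budget):
    """Greedy budget-constrained routing (illustrative assumption only).

    estimates: one list of ToolEstimate per query.
    budget:    total cost allowed across all queries.
    Returns (chosen tool name per query, total cost spent).
    """
    # Start each query on its cheapest tool (ties broken by higher score).
    choice = [min(opts, key=lambda e: (e.cost, -e.score)) for opts in estimates]
    spent = sum(e.cost for e in choice)

    while True:
        best = None  # (gain per unit cost, query index, candidate estimate)
        for i, opts in enumerate(estimates):
            cur = choice[i]
            for cand in opts:
                gain = cand.score - cur.score
                extra = cand.cost - cur.cost
                if gain <= 0 or spent + extra > budget:
                    continue  # no predicted improvement, or over budget
                ratio = float("inf") if extra <= 0 else gain / extra
                if best is None or ratio > best[0]:
                    best = (ratio, i, cand)
        if best is None:
            break  # no affordable upgrade improves any query
        _, i, cand = best
        spent += cand.cost - choice[i].cost
        choice[i] = cand

    return [e.tool for e in choice], spent


# Hypothetical predictions for two queries and three homogeneous tools.
estimates = [
    [ToolEstimate("llm_only", 0.90, 0.0),
     ToolEstimate("bing", 0.92, 1.0),
     ToolEstimate("google", 0.93, 2.0)],
    [ToolEstimate("llm_only", 0.20, 0.0),
     ToolEstimate("bing", 0.70, 1.0),
     ToolEstimate("google", 0.80, 2.0)],
]
tools, spent = route_queries(estimates, budget=1.0)
print(tools, spent)
```

In this toy input the first query is already predicted to be answered well without retrieval, so a limited budget goes to the second query, where retrieval is predicted to help most. Sweeping `budget` over a range of values would trace out a cost-accuracy trade-off for this particular greedy rule.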

Paper Structure (27 sections, 6 equations, 3 figures, 4 tables)

Introduction
Related Work
Tool Learning
LLM Selection
Method
Problem Formulation
Training the Predictive Model
Data Preparation
Training
Assignment Strategies
Experiment
Experimental Setup
Datasets and Training Details
Baselines
Evaluating the Predictive Model
...and 12 more sections

Figures (3)

Figure 1: The win/tie/lose rates on our PrivateTimeQA test set when the LLM uses two compared tools. Besides Bing and Google, we consider a non-retrieval baseline, and denote the method as "LLM-only".
Figure 2: The framework of our method. We first predict the score the LLM would achieve on each query with each tool. We then design different strategies to assign each query to the optimal tool on demand.
Figure 3: The cost-accuracy curves on PrivateTimeQA and CDQA when using Qwen-max.
