Query Routing for Homogeneous Tools: An Instantiation in the RAG Scenario

Feiteng Mu, Yong Jiang, Liwen Zhang, Chu Liu, Wenjie Li, Pengjun Xie, Fei Huang

TL;DR

This paper addresses the selection of homogeneous tools by predicting both the performance and the cost of each tool on a given task, and then assigning each query to the optimal tool in a cost-effective manner.

Abstract

Current research on tool learning primarily focuses on selecting the most effective tool from a wide array of options, often overlooking cost-effectiveness, a crucial factor in human problem-solving. In this paper, we address the selection of homogeneous tools by predicting both their performance and the associated cost required to accomplish a given task. We then assign queries to the optimal tools in a cost-effective manner. Our experimental results demonstrate that our method achieves higher performance at a lower cost compared to strong baseline approaches.

Paper Structure

This paper contains 27 sections, 6 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: The win/tie/lose rates on our PrivateTimeQA test set for each pair of tools used by the LLM. Besides Bing and Google, we consider a non-retrieval baseline, denoted "LLM-only".
  • Figure 2: The framework of our method. We first predict the score the LLM would obtain when calling each tool on each query. Then, we design different strategies to assign each query to the optimal tool on demand.
  • Figure 3: The cost-accuracy curves on PrivateTimeQA and CDQA when using Qwen-max.
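
The two-stage idea described in the Figure 2 caption (predict a score per tool, then assign each query under a cost budget) can be illustrated with a small sketch. This is not the paper's actual assignment strategy, which is not given here. It assumes a simple linear trade-off between predicted score and cost, and the names `ToolEstimate`, `route_query`, and `trade_off`, as well as every number below, are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToolEstimate:
    """Hypothetical per-query estimate for one tool (not from the paper)."""
    name: str
    predicted_score: float  # assumed output of some score predictor, in [0, 1]
    cost: float             # assumed normalized cost of calling the tool


def route_query(estimates, trade_off=0.5):
    """Pick the tool with the best score-minus-weighted-cost utility.

    The linear utility is an assumption made for this sketch; a larger
    trade_off value penalizes expensive tools more strongly.
    """
    best = max(estimates, key=lambda e: e.predicted_score - trade_off * e.cost)
    return best.name


# Made-up estimates for a single query; the tool names follow Figure 1.
estimates = [
    ToolEstimate("LLM-only", predicted_score=0.40, cost=0.0),
    ToolEstimate("Bing", predicted_score=0.70, cost=0.5),
    ToolEstimate("Google", predicted_score=0.75, cost=1.0),
]

for weight in (0.0, 0.5, 1.0):
    print(weight, route_query(estimates, trade_off=weight))
```

Sweeping `trade_off` from low to high moves the choice from the highest-scoring tool toward the cheapest one, which is the kind of trade-off a cost-accuracy curve such as Figure 3 summarizes.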