GraphRouter: A Graph-based Router for LLM Selections

Tao Feng, Yanzhen Shen, Jiaxuan You

TL;DR

This work tackles the problem of efficiently selecting among a large and evolving set of LLMs by exploiting contextual interactions among tasks, queries, and models. It introduces GraphRouter, an inductive heterogeneous graph framework that represents tasks, queries, and LLMs as node types and uses edge prediction to estimate LLM reward (performance) and cost, enabling zero-shot adaptation to new LLMs without retraining. Empirical results across multiple tasks show consistent improvements in the Reward metric and strong generalization to unseen LLMs, with substantial reductions in training time when using few-shot scores for new models. The authors release their code and demonstrate that a graph-based, context-aware router can outperform several baselines and approach an oracle upper bound in real-world LLM selection scenarios.

Abstract

The rapidly growing number and variety of Large Language Models (LLMs) present significant challenges in efficiently selecting the appropriate LLM for a given query, especially considering the trade-offs between performance and computational cost. Current LLM selection methods often struggle to generalize across new LLMs and different tasks because of their limited ability to leverage contextual interactions among tasks, queries, and LLMs, as well as their dependence on a transductive learning framework. To address these shortcomings, we introduce a novel inductive graph framework, named GraphRouter, which fully utilizes the contextual information among tasks, queries, and LLMs to enhance the LLM selection process. GraphRouter constructs a heterogeneous graph comprising task, query, and LLM nodes, with interactions represented as edges, which efficiently captures the contextual information between the query's requirements and the LLM's capabilities. Through an innovative edge prediction mechanism, GraphRouter is able to predict attributes (the effect and cost of the LLM response) of potential edges, allowing for optimized recommendations that adapt to both existing and newly introduced LLMs without requiring retraining. Comprehensive experiments across three distinct effect-cost weight scenarios have shown that GraphRouter substantially surpasses existing routers, delivering a minimum performance improvement of 12.3%. In addition, it achieves enhanced generalization in new-LLM settings and supports diverse tasks with at least a 9.5% boost in effect and a significant reduction in computational demands. This work endeavors to apply a graph-based approach for the contextual and adaptive selection of LLMs, offering insights for real-world applications. Our code for GraphRouter is released at https://github.com/ulab-uiuc/GraphRouter.
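
To make the effect-cost trade-off concrete, the sketch below picks an LLM from predicted effect and cost values under two different weightings. It is an illustration only: the linear score `effect_weight * effect - cost_weight * cost`, the names `Prediction` and `select_llm`, the model names, and all numbers are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical router output for one candidate LLM on one query."""
    llm: str
    effect: float  # predicted response quality, higher is better
    cost: float    # predicted response cost, lower is better


def select_llm(predictions, effect_weight, cost_weight):
    """Return the candidate with the highest weighted score.

    The linear form below is an assumed trade-off, not the paper's formula.
    """
    def score(p):
        return effect_weight * p.effect - cost_weight * p.cost

    return max(predictions, key=score)


# Made-up predictions for a single query.
candidates = [
    Prediction("small-llm", effect=0.60, cost=0.10),
    Prediction("large-llm", effect=0.80, cost=0.90),
]

# Scenario that only values effect.
performance_first = select_llm(candidates, effect_weight=1.0, cost_weight=0.0)
# Scenario that weights cost heavily.
cost_first = select_llm(candidates, effect_weight=0.2, cost_weight=0.8)

print(performance_first.llm, cost_first.llm)
```

With these made-up values, the effect-only weighting selects the larger model and the cost-heavy weighting selects the smaller one, which is the kind of scenario-dependent choice a router has to make.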

Paper Structure

This paper contains 21 sections, 4 equations, 7 figures, 23 tables, and 1 algorithm.

Figures (7)

  • Figure 1: Overview of GraphRouter's LLM selection process. As depicted in the left section, the LLM selection process begins with the user inputting a query that belongs to a certain task. When the router receives the query and the task, it will analyze the input and choose the most appropriate $LLM_n$ for generation. Then, $LLM_n$ is used to generate the response. In the end, this response, along with the measured effectiveness and cost, is returned to the user. The right side of the figure illustrates example interaction records, which contain contextual information like task, user query, selected LLM, response, performance, and cost, in a table. These contextualized data are then utilized to train the router.
  • Figure 2: The probability distribution of a small LLM (LLaMA-3 (7b)) having a better performance value than a large LLM (LLaMA-3 (70b)) by $t$ on the Alpaca dataset, where $t$ means the difference in performance between the small LLM and the large LLM and $t \in [-1, 1]$.
  • Figure 3: Distribution of the performance of different LLMs in response to queries on the Alpaca task. Specifically, we present a violin plot illustrating the performance of ten LLMs of varying sizes and the dot in each distribution is the median performance.
  • Figure 4: Distribution of the performance of different LLMs responding to queries on the SQUAD task. In particular, the performance of ten LLMs of varying sizes is displayed in a violin plot and the dot in each distribution is the median performance.
  • Figure 5: Overview of GraphRouter methodology. GraphRouter first converts the interaction data among tasks, queries, and LLMs into a graph. Specifically, as illustrated on the right side, tasks, queries, and LLMs from the left table are represented as task nodes, query nodes, and LLM nodes, respectively. Moreover, their relationships derived from the interaction data are modeled as edge features. With this structure, we leverage a GNN to embed both node and edge features, ultimately producing the probability distribution of the selected LLM.
  • ...and 2 more figures
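
The graph construction described in the Figure 5 caption can be sketched roughly as follows: interaction records become task, query, and LLM nodes, and the measured performance and cost become features on query-LLM edges. This is a minimal sketch using plain Python containers. The record fields, the `build_graph` helper, the choice to link tasks to queries with unweighted edges, and all values are assumptions for illustration; the GNN that embeds the nodes and edges is not shown.

```python
# Made-up interaction records, one row per (query, LLM) response.
records = [
    {"task": "Alpaca", "query": "q1", "llm": "llm-a", "performance": 0.7, "cost": 0.2},
    {"task": "Alpaca", "query": "q1", "llm": "llm-b", "performance": 0.9, "cost": 0.8},
    {"task": "SQUAD", "query": "q2", "llm": "llm-a", "performance": 0.4, "cost": 0.1},
]


def build_graph(rows):
    """Turn interaction records into a small heterogeneous graph.

    Returns the node sets per type, the task-query edges, and a mapping
    from each query-LLM edge to its (performance, cost) features.
    """
    nodes = {"task": set(), "query": set(), "llm": set()}
    task_query_edges = set()
    query_llm_edges = {}
    for row in rows:
        nodes["task"].add(row["task"])
        nodes["query"].add(row["query"])
        nodes["llm"].add(row["llm"])
        # Assumed edge type: each query is linked to the task it belongs to.
        task_query_edges.add((row["task"], row["query"]))
        # Edge features carry the observed performance and cost.
        query_llm_edges[(row["query"], row["llm"])] = (row["performance"], row["cost"])
    return nodes, task_query_edges, query_llm_edges


nodes, task_query_edges, query_llm_edges = build_graph(records)
print(sorted(nodes["llm"]), len(query_llm_edges))
```

In this framing, routing a new query amounts to predicting the features of query-LLM edges that have not been observed yet, which is why a newly added LLM node only needs a few observed edges instead of a full retraining run.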