PersonalizedRouter: Personalized LLM Routing via Graph-based User Preference Modeling

Zhongjie Dai, Tao Feng, Jiaxuan You

TL;DR

PersonalizedRouter presents a graph-based, inductive approach to personalized LLM routing by learning latent user preferences from interaction data through a heterogeneous graph of users, tasks, queries, and LLMs. By casting routing as a link-prediction problem and employing a heterogeneous GNN, it achieves inductive generalization to unseen users and unseen LLMs, validated under two evaluation strategies: multi-cost-efficiency and LLM-as-a-Judge. The authors introduce PersonaRoute-Bench, a large-scale benchmark with 1,000 simulated users and 10 LLMs, and demonstrate substantial gains over state-of-the-art baselines, including strong few-shot generalization. The work advances user-centric AI routing, enabling personalized, cost-aware, and style-aware LLM selection in multi-user environments with practical implications for scalable AI systems.

Abstract

The growing number of Large Language Models (LLMs) with diverse capabilities and response styles gives users a wider range of choices but makes selecting an appropriate LLM harder, because user preferences vary in performance, cost, and response style. Current LLM selection methods typically optimize for a single fixed objective, such as performance, cost, or a trade-off between them, and fail to learn individual user preferences from interaction data. To address these limitations, we propose PersonalizedRouter, a graph-based framework that models diverse user profiles and performs personalized LLM selection by leveraging interaction data that includes task context, queries, candidate LLMs, and user decisions. To capture contextual information between user queries and optimal LLMs, PersonalizedRouter converts the interaction data into a heterogeneous graph in which edges represent the relationships between different types of nodes. To evaluate adaptability across users, we design two strategies: the multi-cost-efficiency simulation strategy and the LLM-as-a-Judge strategy. In addition, we construct PersonaRoute-Bench, a large-scale benchmark with 1,000 simulated users and 10 LLMs. Experimental results show that PersonalizedRouter significantly outperforms existing LLM selection methods, surpassing the strongest methods by 15.38% and 9.83% under the two simulation strategies, respectively. On PersonaRoute-Bench with 1,000 users, it further surpasses the best methods by 16.19% and 59.69% while maintaining higher efficiency. Moreover, PersonalizedRouter demonstrates strong few-shot generalization, achieving 64.81% and 85.80% of the fully trained model's performance when adapting to new users and new LLMs, respectively.
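
To make the graph construction described in the abstract more concrete, the following is a minimal sketch of how interaction records might be turned into a heterogeneous graph with user, task, query, and LLM nodes. It is not the authors' implementation: the `Interaction` record, the `build_hetero_graph` helper, and the three edge types are assumptions chosen for illustration, and node and edge features are omitted.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Interaction:
    """One hypothetical interaction record (field names are assumptions)."""
    user: str
    task: str
    query: str
    candidates: Tuple[str, ...]  # names of the candidate LLMs
    chosen: str                  # the LLM the user decided on


def build_hetero_graph(interactions):
    """Collect typed nodes, typed edges, and per-(user, query) decisions."""
    nodes = {"user": set(), "task": set(), "query": set(), "llm": set()}
    edges = defaultdict(set)  # key: (source type, relation, target type)
    labels = {}               # (user, query) -> chosen LLM, the link to predict
    for it in interactions:
        if it.chosen not in it.candidates:
            raise ValueError("chosen LLM must be one of the candidates")
        nodes["user"].add(it.user)
        nodes["task"].add(it.task)
        nodes["query"].add(it.query)
        nodes["llm"].update(it.candidates)
        # Assumed edge schema: who asked the query, which task it belongs to,
        # and which LLMs were available as candidates for it.
        edges[("user", "asks", "query")].add((it.user, it.query))
        edges[("query", "belongs_to", "task")].add((it.query, it.task))
        for llm in it.candidates:
            edges[("query", "candidate", "llm")].add((it.query, llm))
        labels[(it.user, it.query)] = it.chosen
    return nodes, dict(edges), labels


# Two users see the same query but make different decisions.
example = [
    Interaction("u0", "math", "q0", ("llm_small", "llm_large"), "llm_small"),
    Interaction("u1", "math", "q0", ("llm_small", "llm_large"), "llm_large"),
]
nodes, edges, labels = build_hetero_graph(example)
```

In this toy example the two users share one query node and one task node, so the only thing that distinguishes their decisions is the user node itself, which is the kind of signal a personalized router would need to learn from.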

Paper Structure

This paper contains 31 sections, 4 equations, 3 figures, 21 tables.

Figures (3)

  • Figure 1: Overview of PersonalizedRouter methodology. As shown in the left part, we first utilize the candidate LLMs to generate responses based on the multi-task dataset. Next, under two simulation strategies, we obtain the corresponding interaction data. As illustrated in the middle part, PersonalizedRouter transforms the user interaction data into a graph, where nodes represent the user, task, query, and LLM, and the edges capture the relationships between different node types. In the right part, we leverage a GNN to embed both node and edge features, updating and capturing the user’s hidden features. Ultimately, we select the optimal LLM from the predicted probability distribution.
  • Figure 2: Comparison of reward and accuracy under different GNN layer counts using two simulation strategies.
  • Figure 3: t-SNE visualization of routing decisions. User embeddings are shown as circles and LLM embeddings as triangles; color indicates the assignment to a particular LLM. Users 0–2 are cost-oriented, while users 3–9 are performance-oriented. PersonalizedRouter learns the latent preferences of users, separates the two groups, and performs personalized routing accordingly.
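
The final step in the Figure 1 caption, selecting the optimal LLM from a predicted probability distribution, can be sketched as follows. This is only an illustration under stated assumptions: the dot-product scoring, the hand-written two-dimensional embeddings, and the names `softmax` and `route` are hypothetical stand-ins for what a trained GNN would produce, not the paper's actual scoring function.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(context_emb, llm_embs):
    """Score each candidate LLM against a context embedding and pick the best.

    context_emb: hypothetical embedding summarizing the user and query.
    llm_embs: dict mapping LLM name -> embedding of the same dimension.
    Returns the chosen LLM name and the full probability distribution.
    """
    names = list(llm_embs)
    # Assumed scoring rule: plain dot product between the two embeddings.
    scores = [sum(a * b for a, b in zip(context_emb, llm_embs[n])) for n in names]
    probs = softmax(scores)
    best = max(range(len(names)), key=probs.__getitem__)
    return names[best], dict(zip(names, probs))


# Toy embeddings: the first axis loosely stands for "performance-oriented",
# the second for "cost-oriented". These values are made up for illustration.
llm_embs = {"llm_small": [0.2, 0.9], "llm_large": [0.9, 0.1]}
perf_user_choice, perf_probs = route([1.0, 0.0], llm_embs)
cost_user_choice, cost_probs = route([0.0, 1.0], llm_embs)
```

With these toy values, the performance-leaning context is routed to `llm_large` and the cost-leaning context to `llm_small`, mirroring the separation between user groups that Figure 3 describes.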