ProxRouter: Proximity-Weighted LLM Query Routing for Improved Robustness to Outliers

Shivam Patel; Neharika Jali; Ankur Mallick; Gauri Joshi

TL;DR

ProxRouter targets robust LLM query routing under outlier shifts by recasting nonparametric routers (clustering and $k$NN) as a proximity-weighted aggregation framework. It introduces minimum-variance priors and an exponential proximity tilt controlled by a tunable parameter $\tau$ to produce weights $w_i(\mathbf{x}) \propto p_i(\mathbf{x})\exp(-\phi_i(\mathbf{x})/\tau)$, improving estimates of model utility $\widehat{U}^{(m)}(\mathbf{x})$ and reducing bias without requiring outlier detection. The authors formalize a unified representation for KM and $k$NN routers and report outlier generalization gains across 14 LLMs and 10 datasets, with KM-Prox and $k$NN-Prox approaching the AllSee upper bound while preserving inlier performance. In outlier settings, AUC rises for both routers (e.g., KM-Prox reaches up to 75.12% vs. 70.68% for the base router; $k$NN-Prox reaches 68.12% vs. 63.98%), with minimal routing overhead, which supports practical deployment. The framework also provides a mechanism to trigger router retraining based on model-ranking similarity, helping keep performance stable as task distributions evolve.
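The weighting rule above can be illustrated with a small sketch. This is not the authors' implementation. It assumes that $\phi_i(\mathbf{x})$ is the squared Euclidean distance between the query embedding and reference point $i$ (a cluster centroid or a neighbor), that a uniform prior stands in for the minimum-variance priors, and that utility is estimated as the weighted average of per-reference utilities. The function names (`proximity_weights`, `route`) and the toy data are hypothetical.

```python
import math

def squared_distance(a, b):
    # Assumed proximity term phi_i: squared Euclidean distance.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def proximity_weights(phis, priors, tau):
    """Return normalized weights w_i proportional to p_i * exp(-phi_i / tau)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Shift by the smallest phi for numerical stability; the common factor
    # cancels in the normalization. Priors are assumed strictly positive.
    phi_min = min(phis)
    raw = [p * math.exp(-(phi - phi_min) / tau) for p, phi in zip(priors, phis)]
    total = sum(raw)
    return [r / total for r in raw]

def route(query, references, model_utilities, tau, priors=None):
    """Pick the model with the highest proximity-weighted utility estimate.

    model_utilities maps a model name to its utility at each reference point
    (hypothetical values; in practice these come from training queries).
    """
    if priors is None:
        # Uniform prior: a stand-in assumption, not the paper's prior.
        priors = [1.0 / len(references)] * len(references)
    phis = [squared_distance(query, r) for r in references]
    weights = proximity_weights(phis, priors, tau)
    estimates = {
        name: sum(w * u for w, u in zip(weights, utils))
        for name, utils in model_utilities.items()
    }
    best = max(estimates, key=estimates.get)
    return best, estimates

# Toy usage: two reference points and two hypothetical models.
references = [(0.0, 0.0), (10.0, 0.0)]
model_utilities = {"small": [0.9, 0.2], "large": [0.6, 0.8]}
best_near_first, _ = route((1.0, 0.0), references, model_utilities, tau=5.0)
best_near_second, _ = route((9.0, 0.0), references, model_utilities, tau=5.0)
```

Under these assumptions, a large $\tau$ makes the weights fall back to the priors, while a small $\tau$ concentrates nearly all weight on the nearest reference point, so $\tau$ controls how strongly proximity dominates the estimate.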

Abstract

Large language model (LLM) query routers are critical to modern AI platforms as they seek to improve efficiency by assigning inference queries to accurate, yet low-cost models. Parametric routers typically use trained neural networks for LLM selection but suffer from retraining and maintenance overheads. Nonparametric routers are training-free, instead estimating LLM accuracy and cost via similarity between encodings of the input query and training set queries. However, like their parametric counterparts, nonparametric routers struggle to generalize to outlier queries, an issue exacerbated by limited diversity in training sets which are costly to expand and difficult to keep current with ever-evolving use cases. We propose ProxRouter, which applies an exponentially tilted aggregation mechanism to balance bias and variance in nonparametric routers, improving their robustness to outliers. Experiments show ProxRouter enhances outlier routing while preserving inlier performance with minimal overhead.
