Routesplain: Towards Faithful and Intervenable Routing for Software-related Tasks
Adam Štorek, Vikas Upadhyay, Marianne Menglin Liu, Daniel W. Peterson, Anshul Mittal, Sujeeth Bharadwaj, Fahad Shah, Dan Roth
TL;DR
Routesplain is a concept-based LLM router for software-related tasks. It formulates routing as a two-stage process: a map $h: \mathbb{R}^d \to \mathbb{R}^k$ from query embeddings to concepts, followed by a map $g: \mathbb{R}^k \to \mathbb{R}^n$ from concepts to model scores, so that the full router is $f = g \circ h$. Training fits $\hat{h}$ with $L_{BCE}(h(\mathbf{x}^i), \mathbf{c}^i)$ and $\hat{g}$ with $L_{BCE}(g(\mathbf{c}^i), \mathbf{m}^i) + \lambda \cdot \mathrm{cost}(\mathbf{x}^i)$, which enables cost-aware routing and lets humans edit the predicted concepts at inference time. The evaluation covers eight software-related tasks and 16 LLMs, using 38,685 examples drawn from diverse datasets. Routesplain achieves competitive accuracy and Pareto-optimal cost-accuracy tradeoffs while providing faithful, interpretable rationales. Ablation and counterfactual concept-manipulation experiments identify query-complexity prediction as the principal bottleneck, which points to targeted improvements. Concept-level interventions shift routing decisions in predictable ways, validating controllability. Overall, Routesplain shows that interpretable, domain-specific routing can match or exceed black-box baselines while lowering cost, and that it offers diagnostic insights for future router improvements.
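The two-stage formulation $f = g \circ h$ and a concept-level intervention can be illustrated with a minimal Python sketch. This is not the authors' implementation: the single-layer sigmoid form of `h` and `g`, the toy weights, the helper names (`linear_sigmoid`, `route`, `bce`), the mean reduction in the loss, and the reading of concept 0 as "high reasoning complexity" are all assumptions made for illustration. The cost term $\lambda \cdot \mathrm{cost}(\mathbf{x}^i)$ is left out because its exact form is not given here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear_sigmoid(weights, bias, x):
    """Affine map followed by an element-wise sigmoid."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def bce(pred, target):
    """Mean binary cross-entropy between probabilities and 0/1 targets."""
    eps = 1e-12  # avoids log(0)
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(pred, target)) / len(pred)

# Hypothetical toy sizes: d=3 embedding dims, k=2 concepts, n=2 candidate models.
H_W = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]   # embeddings -> concepts
H_B = [0.0, 0.0]
G_W = [[-4.0, 0.0], [4.0, 0.0]]            # concepts -> model scores
G_B = [2.0, -2.0]

def h(x):
    """Stage 1: predict concept probabilities from a query embedding."""
    return linear_sigmoid(H_W, H_B, x)

def g(c):
    """Stage 2: score each candidate model from the concepts alone."""
    return linear_sigmoid(G_W, G_B, c)

def route(x, concept_override=None):
    """Compute f = g(h(x)); optionally overwrite concepts before routing."""
    c = h(x)
    for idx, value in (concept_override or {}).items():
        c[idx] = value  # human intervention on a single concept
    scores = g(c)
    best = max(range(len(scores)), key=scores.__getitem__)
    return c, scores, best

x = [-1.0, 0.5, 0.0]                       # toy query embedding
_, _, chosen = route(x)                    # router's own decision
_, _, chosen_edit = route(x, {0: 1.0})     # force concept 0 to "on"
print(chosen, chosen_edit)
```

Because `g` sees only the concept vector, editing a concept is the only way to change the decision for a fixed query, which is what makes the rationale faithful in this toy setting. Here the router picks model 0 on its own and switches to model 1 once concept 0 is forced on.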
Abstract
LLMs now tackle a wide range of software-related tasks, yet we show that their performance varies markedly both across and within these tasks. Routing user queries to the appropriate LLMs can therefore help improve response quality while reducing cost. Prior work, however, has focused mainly on general-purpose LLM routing via black-box models. We introduce Routesplain, the first LLM router for software-related tasks, including multilingual code generation and repair, input/output prediction, and computer science QA. Unlike existing routing approaches, Routesplain first extracts human-interpretable concepts from each query (e.g., task, domain, reasoning complexity) and routes based only on these concepts, thereby providing intelligible, faithful rationales. We evaluate Routesplain on 16 state-of-the-art LLMs across eight software-related tasks. Routesplain outperforms individual models in both accuracy and cost, and equals or surpasses all black-box baselines; concept-level intervention further highlights avenues for router improvement.
