EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing
Yizhang Zhu, Runzhi Jiang, Boyan Li, Nan Tang, Yuyu Luo
TL;DR
EllieSQL addresses the unsustainable token costs of state-of-the-art Text-to-SQL with a complexity-aware routing framework that directs each query to one of three tiered SQL-generation pipelines (Basic, Intermediate, Advanced) according to its estimated complexity. It formalizes the Token Elasticity of Performance ($TEP$) metric to quantify cost-efficiency, and shows that routing can achieve comparable or better execution accuracy ($EX$) while substantially reducing token usage. In experiments on Bird and Spider, the Qwen2.5-0.5B-DPO router reduces token consumption on the Bird development set by more than 40% relative to always using the most advanced pipeline, yields more than a 2× improvement in $TEP$ over non-routing approaches, and generalizes robustly to out-of-distribution data. These results highlight the practical value of balancing resource expenditure against performance for scalable, sustainable Text-to-SQL deployment, and invite the community to pursue cost-aware, pipeline-aware strategies alongside accuracy improvements.
Abstract
Text-to-SQL automatically translates natural language queries to SQL, allowing non-technical users to retrieve data from databases without specialized SQL knowledge. Despite the success of advanced LLM-based Text-to-SQL approaches on leaderboards, their unsustainable computational costs, often overlooked, stand as the "elephant in the room" in current leaderboard-driven research, limiting their economic practicality for real-world deployment and widespread adoption. As an exploratory step toward addressing this, we propose EllieSQL, a complexity-aware routing framework that assigns queries to suitable SQL generation pipelines based on estimated complexity. We investigate multiple routers that direct simple queries to efficient approaches while reserving computationally intensive methods for complex cases. Drawing from economics, we introduce the Token Elasticity of Performance (TEP) metric, which captures cost-efficiency by quantifying the responsiveness of performance gains relative to token investment in SQL generation. Experiments show that, compared to always using the most advanced methods in our study, EllieSQL with the Qwen2.5-0.5B-DPO router reduces token use by over 40% without compromising performance on the Bird development set, achieving more than a 2x boost in TEP over non-routing approaches. This not only advances the pursuit of cost-efficient Text-to-SQL but also invites the community to weigh resource efficiency alongside performance, contributing to progress in sustainable Text-to-SQL. Our source code and model are available at https://elliesql.github.io/.
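To make the two ideas in the abstract concrete, here is a minimal Python sketch of complexity-aware routing and of an elasticity-style cost-efficiency score. It is an illustration under stated assumptions, not the released implementation. The names `RunStats`, `token_elasticity`, `route`, and `toy_classifier` are hypothetical. The elasticity formula assumes the standard economic form (relative change in performance divided by relative change in token cost, both measured against a baseline pipeline), and the exact TEP definition in the paper may differ. The word-count classifier is only a stand-in for a learned router such as Qwen2.5-0.5B-DPO.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class RunStats:
    """Aggregate results for one method (hypothetical container)."""
    ex: float      # execution accuracy, in [0, 1]
    tokens: float  # average tokens consumed per query


def token_elasticity(baseline: RunStats, method: RunStats) -> float:
    """Elasticity-style score: relative EX gain per relative token increase.

    ASSUMPTION: this is the textbook elasticity form, used here only to
    illustrate the idea behind TEP; it is not taken from the paper.
    """
    rel_perf = (method.ex - baseline.ex) / baseline.ex
    rel_tokens = (method.tokens - baseline.tokens) / baseline.tokens
    if rel_tokens == 0:
        raise ValueError("method and baseline use the same number of tokens")
    return rel_perf / rel_tokens


def route(question: str,
          classify: Callable[[str], str],
          pipelines: Dict[str, Callable[[str], str]]) -> str:
    """Send the question to the pipeline chosen by the complexity classifier."""
    tier = classify(question)
    return pipelines[tier](question)


def toy_classifier(question: str) -> str:
    """Stand-in for a learned router: longer questions count as harder."""
    n_words = len(question.split())
    if n_words <= 6:
        return "basic"
    if n_words <= 12:
        return "intermediate"
    return "advanced"


# Placeholder pipelines; real ones would call an LLM to generate SQL.
PIPELINES: Dict[str, Callable[[str], str]] = {
    "basic": lambda q: f"[basic] {q}",
    "intermediate": lambda q: f"[intermediate] {q}",
    "advanced": lambda q: f"[advanced] {q}",
}

# Made-up numbers for illustration only; these are NOT results from the paper.
basic_only = RunStats(ex=0.50, tokens=1000)
advanced_only = RunStats(ex=0.60, tokens=5000)
routed = RunStats(ex=0.60, tokens=3000)

tep_advanced = token_elasticity(basic_only, advanced_only)
tep_routed = token_elasticity(basic_only, routed)
routed_answer = route("List all users", toy_classifier, PIPELINES)
```

In this toy setting the routed configuration reaches the same accuracy as the always-advanced one while spending fewer tokens, so its elasticity score is higher. That is the kind of trade-off the metric is meant to expose.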
