AGENTIQL: An Agent-Inspired Multi-Expert Framework for Text-to-SQL Generation
Omid Reza Heidari, Siobhan Reid, Yassine Yaakoubi
TL;DR
AGENTIQL is an agent-inspired multi-expert framework for NL2SQL. It decomposes complex questions into sub-questions, generates a sub-query for each with a specialized coding agent, refines the final column selection, and dynamically routes between a modular divide-and-merge pipeline and a baseline parser. On Spider, the approach reaches up to 86.07% execution accuracy (EX) with 14-billion-parameter models, narrowing the gap to the GPT-4-based SOTA (89.65% EX), and its explicit intermediate steps improve interpretability. Key contributions are the Divide-and-Merge module, column selection refinement, and an adaptive router that leverages the complementary strengths of the components. The work demonstrates that modular reasoning, targeted code generation, and adaptive routing can deliver robust, scalable, and transparent NL2SQL systems, with practical implications for handling diverse schemas and complex queries.
Abstract
Large language models (LLMs) have advanced text-to-SQL generation, yet monolithic architectures struggle with complex reasoning and schema diversity. We propose AGENTIQL, an agent-inspired multi-expert framework that combines a reasoning agent for question decomposition, a coding agent for sub-query generation, and a refinement step for column selection. An adaptive router further balances efficiency and accuracy by selecting between our modular pipeline and a baseline parser. Several steps in the pipeline can be executed in parallel, making the framework scalable to larger workloads. Evaluated on the Spider benchmark, AGENTIQL improves execution accuracy and interpretability, achieving up to 86.07% EX with 14B models using the Planner&Executor merging strategy. This performance depends on the effectiveness of the routing mechanism; with it, AGENTIQL narrows the gap to the GPT-4-based SOTA (89.65% EX) while using much smaller open-source LLMs. Beyond accuracy, AGENTIQL enhances transparency by exposing intermediate reasoning steps, offering a robust, scalable, and interpretable approach to semantic parsing.
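The control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`Experts`, `divide_and_merge`, `answer`, and the toy stand-ins) is hypothetical, the LLM-backed components are replaced by plain callables, and the routing criterion is left as an opaque predicate because the abstract does not specify it. The sketch also assumes that sub-query generation is one of the steps that can run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Experts:
    """Hypothetical callables standing in for the LLM-backed components."""
    decompose: Callable[[str], List[str]]      # reasoning agent: question -> sub-questions
    generate_sql: Callable[[str], str]         # coding agent: sub-question -> sub-query
    merge: Callable[[str, List[str]], str]     # merging strategy: sub-queries -> one query
    refine_columns: Callable[[str, str], str]  # column selection refinement
    baseline: Callable[[str], str]             # baseline parser: question -> query
    use_pipeline: Callable[[str], bool]        # adaptive router (criterion assumed)


def divide_and_merge(question: str, experts: Experts) -> str:
    """Decompose, generate sub-queries, merge, then refine the selected columns."""
    sub_questions = experts.decompose(question)
    # Assumption: each sub-query depends only on its own sub-question,
    # so generation can be dispatched concurrently. map() preserves order.
    with ThreadPoolExecutor() as pool:
        sub_queries = list(pool.map(experts.generate_sql, sub_questions))
    merged = experts.merge(question, sub_queries)
    return experts.refine_columns(question, merged)


def answer(question: str, experts: Experts) -> str:
    """Route between the modular pipeline and the baseline parser."""
    if experts.use_pipeline(question):
        return divide_and_merge(question, experts)
    return experts.baseline(question)


# Toy stand-ins so the sketch runs without any model; purely illustrative.
toy = Experts(
    decompose=lambda q: [part.strip() for part in q.split(" and ")],
    generate_sql=lambda sq: f"SELECT * FROM t WHERE note = '{sq}'",
    merge=lambda q, subs: " INTERSECT ".join(subs),
    refine_columns=lambda q, sql: sql.replace("SELECT *", "SELECT name"),
    baseline=lambda q: "SELECT name FROM t",
    use_pipeline=lambda q: " and " in q,
)

if __name__ == "__main__":
    print(answer("a and b", toy))  # goes through the divide-and-merge path
    print(answer("a", toy))        # falls back to the baseline parser
```

Keeping each expert behind a plain callable makes the intermediate outputs (sub-questions, sub-queries, merged query) easy to log, which is the kind of transparency the abstract attributes to the modular design.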
