AGENTIQL: An Agent-Inspired Multi-Expert Framework for Text-to-SQL Generation

Omid Reza Heidari, Siobhan Reid, Yassine Yaakoubi

TL;DR

AGENTIQL introduces an agent-inspired multi-expert framework for NL2SQL that decomposes complex questions into sub-questions, generates a SQL sub-query for each with a specialized coding agent, refines the final column selection, and dynamically routes between a modular divide-and-merge pipeline and a baseline parser. The approach improves execution accuracy on Spider, reaching up to $86.07\%$ EX with $14$-billion-parameter models and approaching the GPT-4-based SOTA ($89.65\%$ EX), while also increasing interpretability through explicit intermediate steps. Key contributions include the Divide-and-Merge module, column selection refinement, and an adaptive router that exploits the complementary strengths of the pipeline and the baseline. The work demonstrates that modular reasoning, targeted code generation, and adaptive routing can deliver robust, scalable, and transparent NL2SQL systems, with practical implications for handling diverse schemas and complex queries.

Abstract

LLMs have advanced text-to-SQL generation, yet monolithic architectures struggle with complex reasoning and schema diversity. We propose AGENTIQL, an agent-inspired multi-expert framework that combines a reasoning agent for question decomposition, a coding agent for sub-query generation, and a refinement step for column selection. An adaptive router further balances efficiency and accuracy by selecting between our modular pipeline and a baseline parser. Several steps in the pipeline can be executed in parallel, making the framework scalable to larger workloads. Evaluated on the Spider benchmark, AGENTIQL improves execution accuracy and interpretability and achieves up to 86.07% EX with 14B models using the Planner&Executor merging strategy. This performance depends on the effectiveness of the routing mechanism, and it narrows the gap to GPT-4-based SOTA (89.65% EX) while using much smaller open-source LLMs. Beyond accuracy, AGENTIQL enhances transparency by exposing intermediate reasoning steps, offering a robust, scalable, and interpretable approach to semantic parsing.
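
The control flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the class `AgentiQLSketch` and its four callables (`decompose`, `generate_sql`, `baseline`, `use_pipeline`) are hypothetical names, the toy lambdas stand in for LLM calls, and the merge step assumes, from its name alone, that a "Last-Subquery" strategy returns the final sub-query. The thread pool reflects the statement that sub-query generation can run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AgentiQLSketch:
    """Hypothetical wiring of router, baseline parser, and divide-and-merge."""
    decompose: Callable[[str], List[str]]  # reasoning agent: question -> sub-questions
    generate_sql: Callable[[str], str]     # coding agent: sub-question -> SQL sub-query
    baseline: Callable[[str], str]         # one-step parser: question -> SQL
    use_pipeline: Callable[[str], bool]    # router policy (an assumption, not the paper's)

    def divide_and_merge(self, question: str) -> str:
        sub_questions = self.decompose(question)
        if not sub_questions:
            # Nothing to decompose: fall back to the one-step parser.
            return self.baseline(question)
        # Sub-queries are independent here, so they can be generated in parallel.
        with ThreadPoolExecutor() as pool:
            sub_queries = list(pool.map(self.generate_sql, sub_questions))
        # Assumed "last sub-query" merge: the final sub-query is the answer.
        return sub_queries[-1]

    def answer(self, question: str) -> str:
        if self.use_pipeline(question):
            return self.divide_and_merge(question)
        return self.baseline(question)


# Toy stand-ins for the LLM-backed components.
sketch = AgentiQLSketch(
    decompose=lambda q: [part.strip() for part in q.split(" and ")],
    generate_sql=lambda sq: f"-- SQL for: {sq}",
    baseline=lambda q: f"-- baseline SQL for: {q}",
    use_pipeline=lambda q: " and " in q,
)

simple = sketch.answer("list singers")
complex_ = sketch.answer("find a and find b")
```

With these stand-ins, the first question is routed to the baseline and the second goes through decomposition. In a real system each callable would wrap a model call, and the router would use whatever signal the framework actually defines.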

Paper Structure

This paper contains 22 sections, 8 figures, 4 tables.

Figures (8)

  • Figure 1: The Divide-and-Merge module of AgentiQL. The reasoning agent splits an input natural language query into multiple sub-questions. The coding agent then generates a corresponding SQL sub-query for each sub-question, and finally all sub-queries are merged to produce the final SQL query. This multi-step approach explicitly exposes intermediate reasoning steps and ensures the final query aligns with the question’s intent.
  • Figure 2: Overall architecture of AgentiQL. An input query is first evaluated by an adaptive router, which decides whether to send it to a one-step baseline parser or to the divide-and-merge module. The baseline directly generates the SQL query, while the divide-and-merge module processes the query through a multi-expert pipeline before producing the final SQL. This design balances interpretability, provided by the modular pipeline, with efficiency, provided by direct execution for simpler cases.
  • Figure 3: Success case with Qwen2.5-7B-Instruct and Qwen2.5-Coder-7B-Instruct using the Planner&Executor merging strategy. The query requires finding customers who have both more than two orders and at least three items. The baseline SQL fails by joining Customers directly with Order_Items, violating the schema. The Divide-and-Merge approach, however, decomposes the task, routes through Orders, and intersects the constraints, producing a valid SQL query that retrieves the correct customers.
  • Figure 4: Success case with Qwen2.5-14B-Instruct and Qwen2.5-Coder-14B-Instruct using the Last-Subquery merging strategy. The query asks for the average price of products that have been ordered. The baseline SQL incorrectly averages all products in Products, ignoring order information. In contrast, the Divide-and-Merge approach decomposes the task, joins Order_Items with Products, and computes the average over ordered products only, yielding the correct result.
  • Figure 5: Success case with Qwen2.5-32B-Instruct and Qwen2.5-Coder-32B-Instruct using the Planner&Executor merging strategy. The query asks for the most common affiliation among city channels. The baseline SQL outputs both the affiliation and its count, adding extra information. The Divide-and-Merge approach refines the output to return only the affiliation, exactly matching the query intent.
  • ...and 3 more figures
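
The Figure 4 case can be reproduced on a toy database to show why the two queries differ under execution-based evaluation. The following is a minimal sketch using Python's built-in `sqlite3`. The table names `Products` and `Order_Items` come from the caption, while the column names, the sample rows, and the helper `execution_match` are assumptions made for illustration. The helper compares result sets, which is the usual idea behind execution accuracy (EX), though the benchmark's official scorer may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows; only the table names appear in the caption.
conn.executescript("""
CREATE TABLE Products (
    product_id INTEGER PRIMARY KEY,
    product_price REAL
);
CREATE TABLE Order_Items (
    order_item_id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES Products(product_id)
);
INSERT INTO Products VALUES (1, 10.0), (2, 20.0), (3, 60.0);
INSERT INTO Order_Items VALUES (1, 1), (2, 2);  -- product 3 was never ordered
""")

# Baseline-style query: averages every product, ignoring order information.
baseline_sql = "SELECT AVG(product_price) FROM Products"

# Divide-and-merge-style query: averages only products that appear in orders.
merged_sql = """
SELECT AVG(p.product_price)
FROM Order_Items AS oi
JOIN Products AS p ON p.product_id = oi.product_id
"""


def run(sql):
    """Execute a query and return its rows in a canonical (sorted) order."""
    return sorted(conn.execute(sql).fetchall())


def execution_match(predicted_sql, gold_sql):
    """Two queries match if they return the same rows on this database."""
    return run(predicted_sql) == run(gold_sql)


baseline_avg = run(baseline_sql)[0][0]  # (10 + 20 + 60) / 3 = 30.0
merged_avg = run(merged_sql)[0][0]      # (10 + 20) / 2 = 15.0
baseline_is_correct = execution_match(baseline_sql, merged_sql)  # False
```

Treating `merged_sql` as the gold query, the baseline fails the execution check because the unordered product inflates its average.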