LearNAT: Learning NL2SQL with AST-guided Task Decomposition for Large Language Models

Weibin Liao, Xin Gao, Tianyu Jia, Rihong Qiu, Yifan Zhu, Yang Lin, Xu Chu, Junfeng Zhao, Yasha Wang

TL;DR

LearNAT improves the performance of open-source LLMs on complex NL2SQL tasks through three components: AST-guided task decomposition, Margin-Aware Reinforcement Learning, and Adaptive Demonstration Reasoning (ADR). The Decomposition Synthesis Procedure uses AST-guided Monte Carlo Tree Search (MCTS) with pruning to generate subtasks, margin-aware DPO (MDPO) with an AST-based margin refines multi-step reasoning, and ADR selects relevant demonstrations at inference time. On the Spider and BIRD benchmarks, a 7B open-source LLM reaches performance comparable to GPT-4 while offering improved efficiency and accessibility, narrowing the gap to closed-source systems. The work highlights the role of structured representations (ASTs) and fine-grained, demonstration-informed optimization in building affordable, capable NL2SQL systems for real-world use.

Abstract

Natural Language to SQL (NL2SQL) has emerged as a critical task for enabling seamless interaction with databases. Recent advancements in Large Language Models (LLMs) have demonstrated remarkable performance in this domain. However, existing NL2SQL methods predominantly rely on closed-source LLMs leveraging prompt engineering, while open-source models typically require fine-tuning to acquire domain-specific knowledge. Despite these efforts, open-source LLMs struggle with complex NL2SQL tasks due to the indirect expression of user query objectives and the semantic gap between user queries and database schemas. Inspired by the application of reinforcement learning in mathematical problem-solving to encourage step-by-step reasoning in LLMs, we propose LearNAT (Learning NL2SQL with AST-guided Task Decomposition), a novel framework that improves the performance of open-source LLMs on complex NL2SQL tasks through task decomposition and reinforcement learning. LearNAT introduces three key components: (1) a Decomposition Synthesis Procedure that leverages Abstract Syntax Trees (ASTs) to guide efficient search and pruning strategies for task decomposition, (2) Margin-aware Reinforcement Learning, which employs fine-grained step-level optimization via DPO with AST margins, and (3) Adaptive Demonstration Reasoning, a mechanism for dynamically selecting relevant examples to enhance decomposition capabilities. Extensive experiments on two benchmark datasets, Spider and BIRD, demonstrate that LearNAT enables a 7B-parameter open-source LLM to achieve performance comparable to GPT-4, while offering improved efficiency and accessibility.
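To make "step-level optimization via DPO with AST margins" concrete, the following is a minimal sketch of a DPO-style loss with an additive margin. It is an illustration under stated assumptions, not the paper's exact objective: the function name `margin_dpo_loss`, the parameter names, the additive form of the margin, and the default `beta` are all hypothetical, and the margin is passed in as a plain number because this section does not specify how it is derived from the ASTs.

```python
import math


def margin_dpo_loss(logp_chosen, logp_rejected,
                    ref_logp_chosen, ref_logp_rejected,
                    margin, beta=0.1):
    """DPO-style loss for one preferred/dispreferred step pair.

    Assumption: the margin enters additively, i.e. the loss is
    -log(sigmoid(delta - margin)). The paper's exact form may differ.
    """
    # Implicit reward gap between the preferred and the dispreferred step,
    # measured relative to a frozen reference policy.
    delta = beta * ((logp_chosen - ref_logp_chosen)
                    - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(delta - margin)) equals softplus(margin - delta);
    # the form below avoids overflow for large |z|.
    z = margin - delta
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# With no reward gap and no margin, the loss is log(2).
baseline = margin_dpo_loss(-1.0, -1.0, -1.0, -1.0, margin=0.0)
# A positive margin asks for a larger gap, so the same pair costs more.
with_margin = margin_dpo_loss(-1.0, -1.0, -1.0, -1.0, margin=0.5)
```

In this form a larger margin demands a larger reward gap between the two steps before the loss becomes small, which is one way a structural difference between steps could be turned into a stronger preference signal.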

Paper Structure

This paper contains 19 sections, 17 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: (a) illustrates the LLM directly solving a complex NL2SQL task, resulting in an incorrect output. (b) shows the LLM solving multiple decomposed simple NL2SQL subtasks from the same task in (a), resulting in a correct output.
  • Figure 2: The abstract syntax tree (AST) of the case shown in Fig. 1. Each simple NL2SQL subtask in Fig. 1 corresponds to a subtree within the AST. Clause nodes, operator nodes, and operand nodes are defined in the paper's AST subsection.
  • Figure 3: A preliminary experiment. We randomly selected 500 cases from the BIRD Train dataset and employed Qwen2.5-Coder to perform the NL2SQL task.
  • Figure 4: Overview of LearNAT. LearNAT designs methodologies for three key processes of RL on LLMs: Training Data Synthesis, Model Training, and Model Inference. Correspondingly, LearNAT proposes Decomposition Synthesis Procedure, Margin-Aware Reinforcement Learning, and Adaptive Demonstration Reasoning.
  • Figure 5: Framework of the Decomposition Synthesis Procedure. (c) illustrates how the LLM, combined with MCTS, performs next-step prediction to synthesize subtasks of complex NL2SQL tasks. (b) presents the AST of the SQL statements corresponding to each synthesized subtask in (c). (a) shows the AST of the Gold SQL for the complex NL2SQL task, which guides the MCTS in (c) to perform more efficient search, including pruning and node reward estimation. (d) depicts the data collected by LearNAT during the Decomposition Synthesis Procedure, comprising successful trajectories data for supervised fine-tuning and step-wise contrastive action pairs data for preference learning. Under the default settings of LearNAT, GLM-4-Plus is used to synthesize decomposition data, and the Qwen2.5-Coder model is fine-tuned.
  • ...and 2 more figures
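
The captions for Figures 2 and 5 describe subtasks as subtrees of the gold SQL's AST and say that the gold AST guides pruning during search. The sketch below illustrates that idea under loud assumptions: the `Node` class, the exact-match subtree test, and the hand-built example tree are hypothetical and are not taken from the paper, which may use a different AST representation and a softer matching criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A toy AST node: a clause, operator, or operand label plus children."""
    label: str
    children: tuple = ()


def subtrees(node):
    """Yield the node itself and every subtree below it."""
    yield node
    for child in node.children:
        yield from subtrees(child)


def is_subtree(candidate, gold):
    """True if the candidate AST appears verbatim somewhere in the gold AST."""
    return any(candidate == sub for sub in subtrees(gold))


# Hand-built AST for: SELECT name FROM users WHERE age > 30
gold = Node("SELECT", (
    Node("COLUMNS", (Node("name"),)),
    Node("FROM", (Node("users"),)),
    Node("WHERE", (Node(">", (Node("age"), Node("30"))),)),
))

# A subtask whose AST matches part of the gold AST would be kept.
good_step = Node("WHERE", (Node(">", (Node("age"), Node("30"))),))
# A subtask that departs from the gold AST would be a pruning candidate.
bad_step = Node("WHERE", (Node("<", (Node("age"), Node("30"))),))

keep_good = is_subtree(good_step, gold)
keep_bad = is_subtree(bad_step, gold)
```

A search procedure could use a check like `is_subtree` to discard candidate subtasks early, so that expansion effort goes only to branches that are still consistent with the gold SQL.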