Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL

Khushboo Thaker, Yony Bresler

TL;DR

This work tackles private, cost-efficient Text-to-SQL by transferring structured reasoning from a high-capacity teacher model to a smaller student model. It introduces Struct-SQL, a knowledge distillation (KD) framework that uses a Query Execution Plan-based CoT (QP-CoT) as the structured teaching signal, so the student learns a formal, stepwise execution blueprint alongside the final SQL. On the BIRD mini-dev benchmark, Struct-SQL substantially outperforms both an unstructured KD baseline (ReasonSQL) and the FN-Gold fine-tuning baseline, primarily by reducing syntactic errors and improving logical reliability, while remaining suitable for private deployment through PEFT (QLoRA). The results underscore the value of structured reasoning signals for distilling complex tasks into resource-constrained models and point toward applications beyond Text-to-SQL.

Abstract

Deploying accurate Text-to-SQL systems at the enterprise level faces a difficult trilemma involving cost, security, and performance. Current solutions force enterprises to choose between expensive, proprietary Large Language Models (LLMs) and low-performing Small Language Models (SLMs). Efforts to improve SLMs often rely on distilling reasoning from LLMs using unstructured Chain-of-Thought (CoT) traces, a process that remains inherently ambiguous. Instead, we hypothesize that a formal, structured reasoning representation provides a clearer, more reliable teaching signal, as the Text-to-SQL task requires explicit and precise logical steps. To evaluate this hypothesis, we propose Struct-SQL, a novel Knowledge Distillation (KD) framework that trains an SLM to emulate a powerful LLM. Specifically, we adopt a query execution plan as a formal blueprint from which to derive this structured reasoning. Our SLM, distilled with structured CoT, achieves an absolute improvement of 8.1% over an unstructured CoT distillation baseline. A detailed error analysis reveals that a key factor in this gain is a marked reduction in syntactic errors. This demonstrates that teaching a model to reason using a structured logical blueprint is beneficial for reliable SQL generation in SLMs.

Paper Structure

This paper contains 20 sections, 1 equation, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Unstructured vs. Structured Reasoning Distillation. The figure contrasts the two methods of reasoning distillation: (Top) Unstructured Distillation (ReasonSQL), which relies on a free-form CoT prompt, with (Bottom) the proposed Structured Distillation (Struct-SQL), which uses a QP-CoT prompt to generate a structured logical blueprint. The Teacher Model's output serves as the supervisory signal for tuning the Student Models.
  • Figure 2: Sample structured query plan for the input "State the most popular movie? When was it released and who is the director for the movie?"
  • Figure 3: Compared to the Teacher Model (a), both the Student Model (b) and the FN-Gold (c) exhibit substantially lower performance, primarily due to high syntactic errors. The unstructured distillation baseline ReasonSQL (d) improves upon both the Student Model and FN-Gold. Struct-SQL (e) achieves the highest success rate among all tuned student models.
  • Figure 4: Detailed performance analysis. (a) Execution Accuracy across different SQL constructs, highlighting Struct-SQL's proficiency in handling complex aggregations. (b) Gains vs. Losses analysis for baseline models relative to the Teacher model. The losses and gains are correlated with overall performance. (c) Performance State Transitions illustrates Struct-SQL's effectiveness in converting severe errors (SYN, GEN) from the Student Model into direct successes or less severe errors (SEM).
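The Execution Accuracy metric named in Figure 4 is commonly computed by running the predicted and the reference SQL against the same database and comparing their result sets. The snippet below is a minimal sketch of that idea using Python's standard `sqlite3` module; it is not the paper's evaluation code. The helper names `execution_match` and `execution_accuracy`, the toy `movies` table, and the example queries are all assumptions made for illustration.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries run and yield the same set of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute (e.g. a syntax error) is a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare as sets so that row order does not affect the outcome.
    return set(predicted_rows) == set(gold_rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return hits / len(pairs)


# Toy in-memory database; the schema and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, year INTEGER, popularity REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("A", 1999, 8.5), ("B", 2005, 9.1), ("C", 2010, 7.2)],
)

pairs = [
    # Differently written but equivalent queries count as a match.
    (
        "SELECT title FROM movies ORDER BY popularity DESC LIMIT 1",
        "SELECT title FROM movies "
        "WHERE popularity = (SELECT MAX(popularity) FROM movies)",
    ),
    # A syntactically invalid prediction counts as a miss.
    ("SELEC title FROM movies", "SELECT title FROM movies"),
]

accuracy = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {accuracy:.2f}")
```

Comparing result sets rather than SQL strings means a prediction is credited whenever it returns the same rows as the reference, and a query that cannot be executed is always scored as a failure, which is why syntactic errors weigh so heavily on this metric.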