Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL
Khushboo Thaker, Yony Bresler
TL;DR
This work tackles private, cost-efficient Text-to-SQL by transferring structured reasoning from a high-capacity teacher to a smaller student. It introduces Struct-SQL, a knowledge distillation framework that uses a Query Execution Plan-based CoT (QP-CoT) as the structured teaching signal, enabling the student to learn a formal, stepwise execution blueprint alongside the final SQL. On the BIRD mini-dev benchmark, Struct-SQL substantially outperforms unstructured KD baselines (ReasonSQL) and a gold-FN finetuning approach, primarily by reducing syntactic errors and improving logical reliability, while remaining suitable for private deployment through PEFT (QLoRA). The results underscore the value of structured reasoning signals for distilling complex tasks to resource-constrained models and point toward broader applications beyond Text-to-SQL.
Abstract
Deploying accurate Text-to-SQL systems at the enterprise level faces a difficult trilemma involving cost, security, and performance. Current solutions force enterprises to choose between expensive, proprietary Large Language Models (LLMs) and low-performing Small Language Models (SLMs). Efforts to improve SLMs often rely on distilling reasoning from LLMs using unstructured Chain-of-Thought (CoT) traces, a process that remains inherently ambiguous. Instead, we hypothesize that a formal, structured reasoning representation provides a clearer, more reliable teaching signal, as the Text-to-SQL task requires explicit and precise logical steps. To evaluate this hypothesis, we propose Struct-SQL, a novel Knowledge Distillation (KD) framework that trains an SLM to emulate a powerful LLM. Specifically, we adopt a query execution plan as a formal blueprint to derive this structured reasoning. Our SLM, distilled with structured CoT, achieves an absolute improvement of 8.1% over an unstructured CoT distillation baseline. A detailed error analysis reveals that a key factor in this gain is a marked reduction in syntactic errors. This demonstrates that teaching a model to reason using a structured logical blueprint is beneficial for reliable SQL generation in SLMs.
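To make the idea of a query execution plan as a reasoning blueprint more concrete, the following minimal Python sketch shows what such a plan looks like for a small SQLite database and how a "plan, then SQL" training target might be assembled. It is an illustration only, not the Struct-SQL pipeline: the abstract does not say how the teacher's structured reasoning is produced, so obtaining the plan from SQLite's `EXPLAIN QUERY PLAN` is an assumption made here, and the schema, the question, and the helper names `plan_steps` and `build_target` are hypothetical.

```python
import sqlite3


def plan_steps(conn, sql):
    """Return the detail strings of SQLite's query plan for `sql`, in order."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # Each row has four columns; the last one is the human-readable detail.
    return [row[3] for row in rows]


def build_target(question, sql, steps):
    """Assemble a hypothetical 'plan, then SQL' training target as plain text."""
    lines = [f"Question: {question}", "Plan:"]
    lines += [f"  {i}. {step}" for i, step in enumerate(steps, start=1)]
    lines += ["SQL:", f"  {sql}"]
    return "\n".join(lines)


# Toy schema (an assumption for illustration; not taken from BIRD).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        total REAL
    );
    """
)

question = "What is the total order value for each customer?"
sql = (
    "SELECT c.name, SUM(o.total) "
    "FROM customers AS c JOIN orders AS o ON o.customer_id = c.id "
    "GROUP BY c.name"
)

steps = plan_steps(conn, sql)
target = build_target(question, sql, steps)
print(target)
```

The exact plan text depends on the SQLite version, so the printed steps are not reproduced here. The point of the sketch is the shape of the target: an ordered list of explicit execution steps followed by the final query, in contrast to a free-form CoT trace.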
