Exploring Chain-of-Thought Style Prompting for Text-to-SQL
Chang-You Tai, Ziru Chen, Tianshu Zhang, Xiang Deng, Huan Sun
TL;DR
This paper investigates the effectiveness of chain-of-thought style prompting for text-to-SQL parsing. It compares chain-of-thought prompting and least-to-most prompting, and introduces a new question decomposition prompting method (QDecomp) together with a schema-grounding variant (QDecomp+InterCOL). The results show that iterative prompting is often unnecessary and that overly detailed reasoning steps can cause error propagation. QDecomp+InterCOL delivers the strongest gains on the Spider and Spider Realistic datasets and remains competitive when larger prompts are used. The work also reports robustness to the design of in-context examples and points to directions for applying multi-step reasoning in semantic parsing, suggesting future exploration across more LLMs and interactive grounding frameworks.
Abstract
In-context learning with large language models (LLMs) has recently attracted increasing attention due to its superior few-shot performance on various tasks. However, its performance on text-to-SQL parsing still has much room for improvement. In this paper, we hypothesize that a crucial aspect of LLMs to improve for text-to-SQL parsing is their multi-step reasoning ability. Thus, we systematically study how to enhance LLMs' reasoning ability through chain-of-thought (CoT) style prompting, including the original chain-of-thought prompting (Wei et al., 2022b) and least-to-most prompting (Zhou et al., 2023). Our experiments demonstrate that iterative prompting as in Zhou et al. (2023) may be unnecessary for text-to-SQL parsing, and that using detailed reasoning steps tends to suffer more from error propagation. Based on these findings, we propose a new CoT-style prompting method for text-to-SQL parsing. Compared to the standard prompting method without reasoning steps, it brings 5.2 and 6.5 point absolute gains on the Spider development set and the Spider Realistic set, respectively; compared to the least-to-most prompting method, it brings 2.4 and 1.5 point absolute gains.
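
To make the idea of single-pass, decomposition-style prompting concrete, the following is a minimal sketch of how such a prompt might be assembled. It is an illustration under stated assumptions, not the prompt format used in the paper: the names `Step`, `format_demo`, and `build_prompt`, the one-line schema strings, the "Decompose the question:" cue, the per-step "Columns:" line, and the example questions are all hypothetical. The point it tries to show is that the sub-questions, the schema items they refer to, and the final SQL all sit in one demonstration, so the model can be queried once rather than iteratively.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One decomposed sub-question plus the schema items it refers to (hypothetical layout)."""
    sub_question: str
    columns: List[str]


def format_demo(schema: str, question: str, steps: List[Step], sql: str) -> str:
    """Render one in-context demonstration: schema, question, numbered steps, final SQL."""
    lines = [schema, f"Question: {question}", "Decompose the question:"]
    for i, step in enumerate(steps, start=1):
        lines.append(f"{i}. {step.sub_question}")
        lines.append(f"   Columns: {', '.join(step.columns)}")
    lines.append(f"SQL: {sql}")
    return "\n".join(lines)


def build_prompt(demos: List[str], schema: str, question: str) -> str:
    """Concatenate demonstrations and the test question into a single prompt.

    The prompt ends at the decomposition cue, so one model call is expected to
    produce both the steps and the SQL (no iterative re-prompting).
    """
    test = "\n".join([schema, f"Question: {question}", "Decompose the question:"])
    return "\n\n".join(demos + [test])


# Illustrative usage with made-up schemas and questions.
demo = format_demo(
    schema="Table singer(singer_id, name, age)",
    question="What is the name of the oldest singer?",
    steps=[
        Step("List singers ordered by age, oldest first.", ["singer.age"]),
        Step("What is the name of the first one?", ["singer.name"]),
    ],
    sql="SELECT name FROM singer ORDER BY age DESC LIMIT 1",
)
prompt = build_prompt(
    [demo],
    schema="Table concert(concert_id, year)",
    question="How many concerts were held in 2014?",
)
print(prompt)
```

In this sketch, dropping the "Columns:" lines would give a plain decomposition prompt, while keeping them adds a simple form of schema grounding; whether this matches the paper's exact design is an assumption.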
