Exploring Chain-of-Thought Style Prompting for Text-to-SQL

Chang-You Tai, Ziru Chen, Tianshu Zhang, Xiang Deng, Huan Sun

TL;DR

This paper investigates chain-of-thought style prompting for text-to-SQL parsing. It compares chain-of-thought prompting and least-to-most prompting, and introduces question decomposition prompting (QDecomp) along with a schema-grounding variant (QDecomp+InterCOL). The results show that iterative prompting is often unnecessary and that overly detailed reasoning steps can cause error propagation. QDecomp+InterCOL delivers the strongest gains on the Spider and Spider Realistic datasets and remains competitive on larger prompts. The work also highlights robustness to in-context example design and points to directions for applying multi-step reasoning in semantic parsing, including evaluation on more LLMs and interactive grounding frameworks.

Abstract

In-context learning with large language models (LLMs) has recently attracted increasing attention due to its superior few-shot performance on various tasks. However, its performance on text-to-SQL parsing still has much room for improvement. In this paper, we hypothesize that a crucial aspect of LLMs to improve for text-to-SQL parsing is their multi-step reasoning ability. Thus, we systematically study how to enhance LLMs' reasoning ability through chain-of-thought (CoT) style prompting, including the original chain-of-thought prompting (Wei et al., 2022b) and least-to-most prompting (Zhou et al., 2023). Our experiments demonstrate that iterative prompting as in Zhou et al. (2023) may be unnecessary for text-to-SQL parsing, and that using detailed reasoning steps tends to cause more error propagation. Based on these findings, we propose a new CoT-style prompting method for text-to-SQL parsing. It brings 5.2 and 6.5 point absolute gains on the Spider development set and the Spider Realistic set, respectively, compared to the standard prompting method without reasoning steps, and 2.4 and 1.5 point absolute gains compared to the least-to-most prompting method.
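The two sets of gains in the abstract can be combined into a rough estimate of how far least-to-most prompting itself sits above standard prompting. The short sketch below does only that subtraction. It assumes all four numbers are measured with the same metric and experimental setup, which the abstract implies but does not state, so the output is an implied estimate rather than a reported result. The dictionary and function names are illustrative.

```python
# Gains reported in the abstract, in absolute points.
# Each value is (gain over standard prompting, gain over least-to-most prompting).
reported_gains = {
    "Spider dev": (5.2, 2.4),
    "Spider Realistic": (6.5, 1.5),
}


def implied_least_to_most_gain(over_standard, over_least_to_most):
    """Gap between least-to-most and standard prompting implied by the two reported gains.

    Assumption: both gains are measured in the same setup, so they can be subtracted.
    """
    # Round to one decimal place to match the precision of the reported numbers.
    return round(over_standard - over_least_to_most, 1)


implied = {
    name: implied_least_to_most_gain(*gains)
    for name, gains in reported_gains.items()
}

for name, gain in implied.items():
    print(f"{name}: least-to-most is roughly {gain} points above standard prompting")
```

Under that assumption, most of the improvement on Spider Realistic is already obtained by least-to-most prompting, while on the Spider development set the improvement is split more evenly between the two steps.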

Paper Structure (27 sections, 9 figures, 9 tables)

Figures (9)

  • Figure 1: Example model input and expected outputs for four CoT style prompting methods applied to text-to-SQL parsing: A. Chain-of-Thought, B. Least-to-Most, C. QDecomp, and D. QDecomp + InterCOL, where C and D are our proposed methods.
  • Figure 2: An example of the API docs prompt format, introduced by Rajkumar et al. (2022), on Spider.
  • Figure 3: An example of the Create Table + Select 3 prompt format, introduced by Rajkumar et al. (2022), on Spider.
  • Figure 4: An example prompt under the standard API docs prompting for 2-shot on Spider.
  • Figure 5: An example prompt under chain-of-thought + API docs prompting for 1-shot on Spider.
  • ...and 4 more figures