C3: Zero-shot Text-to-SQL with ChatGPT
Xuemei Dong, Chao Zhang, Yuhang Ge, Yuren Mao, Yunjun Gao, Lu Chen, Jinshu Lin, Dongfang Lou
TL;DR
This work tackles zero-shot Text-to-SQL by leveraging ChatGPT through the C3 framework, which combines Clear Prompting, Calibration with Hints, and Consistent Output to produce reliable SQL without demonstrations. It demonstrates that careful prompt design, bias calibration, and execution-based self-consistency yield state-of-the-art zero-shot performance on the Spider dataset, surpassing fine-tuning baselines while using significantly fewer tokens. The approach includes systematic schema linking, bias-aware prompting, and multiple-sample voting to mitigate LLM randomness, with extensive ablations and error analysis confirming the contributions. The findings suggest a practical, budget-friendly direction for GPT-based Text-to-SQL research with strong potential for broader deployment.
Abstract
This paper proposes C3, a ChatGPT-based zero-shot Text-to-SQL method that achieves 82.3% execution accuracy on the holdout test set of Spider, making it the state-of-the-art zero-shot Text-to-SQL method on the Spider Challenge. C3 consists of three key components: Clear Prompting (CP), Calibration with Hints (CH), and Consistent Output (CO), which correspond to the model input, model bias, and model output, respectively. Together they provide a systematic treatment of zero-shot Text-to-SQL. Extensive experiments verify the effectiveness and efficiency of the proposed method.
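The Consistent Output component is described above only at a high level (execution-based self-consistency with multiple-sample voting). The following is a minimal sketch of that idea, not the authors' implementation: several candidate SQL queries are executed, candidates that fail are discarded, and a query from the largest group of identical execution results is returned. The function name `vote_by_execution`, the in-memory SQLite database, the `singer` table, and the hard-coded candidates (which in practice would be sampled from ChatGPT) are all assumptions made for illustration.

```python
import sqlite3


def vote_by_execution(conn, candidates):
    """Return a candidate SQL query from the largest group of equal results.

    Hypothetical helper: candidates that raise a database error are dropped,
    and None is returned if no candidate executes successfully.
    """
    groups = {}  # execution result -> list of queries producing it
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # discard queries that cannot be executed
        # Sort rows so that results differing only in row order match.
        key = tuple(sorted(rows, key=repr))
        groups.setdefault(key, []).append(sql)
    if not groups:
        return None
    # Largest group wins; ties go to the group seen first.
    return max(groups.values(), key=len)[0]


# Toy database standing in for a Spider schema (assumed for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?, ?)",
    [(1, "Ann", 30), (2, "Bo", 41), (3, "Cy", 25)],
)

# Stand-ins for multiple samples drawn from the model for one question.
candidates = [
    "SELECT count(*) FROM singer WHERE age > 28",
    "SELECT count(id) FROM singer WHERE age > 28",
    "SELECT count(*) FROM singer WHERE age > 40",
    "SELECT count(*) FROM singers",  # invalid: no such table
]
chosen = vote_by_execution(conn, candidates)
print(chosen)
```

Voting on execution results rather than on SQL text lets syntactically different but equivalent queries reinforce each other, which is one plausible reading of how sampling randomness is mitigated.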
