A Survey on Employing Large Language Models for Text-to-SQL Tasks
Liang Shi, Zhengju Tang, Nan Zhang, Xiaotong Zhang, Zhi Yang
TL;DR
The paper surveys the rise of large language models for Text-to-SQL, distinguishing prompt engineering and finetuning as the two main strategies. It catalogs benchmarks and metrics, analyzes prompt designs, schema linking, and reasoning workflows, and reviews base-model choices (open vs. closed) and training data. Key contributions include a systematic taxonomy of LLM-based Text-to-SQL pipelines, a synthesis of benchmarking studies, and a forward-looking discussion of privacy, domain knowledge, and autonomous agents. The work highlights practical pathways and challenges for deploying LLM-driven Text-to-SQL on real-world, enterprise-scale databases.
Abstract
With the development of Large Language Models (LLMs), a wide range of LLM-based Text-to-SQL (Text2SQL) methods has emerged. This survey provides a comprehensive review of LLM-based Text2SQL studies. We first enumerate classic benchmarks and evaluation metrics. For the two mainstream methods, prompt engineering and finetuning, we introduce a comprehensive taxonomy and offer practical insights into each subcategory. We then present an overall analysis of these methods and of various models evaluated on well-known datasets, and extract some of their characteristics. Finally, we discuss the challenges and future directions in this field.
