MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
Satya Krishna Gorti, Ilan Gofman, Zhaoyan Liu, Jiapeng Wu, Noël Vouitsis, Guangwei Yu, Jesse C. Cresswell, Rasa Hosseinzadeh
TL;DR
The paper addresses the need for accessible, privacy-conscious text-to-SQL systems that do not rely on closed models. It introduces MSc-SQL, a pipeline that combines schema linking, retrieval-augmented SQL generation, and multi-sample critiquing to select the best of several candidate queries generated by small open-source LLMs. The critiquing component jointly reasons over multiple samples and their execution results, enabling competitive performance on the Spider and BIRD benchmarks at a fraction of the cost of GPT-4-based approaches. Extensive ablations show that sample diversity and QLoRA-based fine-tuning are key to achieving strong results, with practical implications for latency-sensitive and privacy-preserving applications.
Abstract
Text-to-SQL generation enables non-experts to interact with databases via natural language. Recent advances rely on large closed-source models such as GPT-4, which present challenges in accessibility, privacy, and latency. To address these issues, we focus on developing small, efficient, and open-source text-to-SQL models. We demonstrate the benefits of sampling multiple candidate SQL generations and propose our method, MSc-SQL, to critique them using associated metadata. Our sample critiquing model evaluates multiple outputs simultaneously, achieving state-of-the-art performance among open-source models while remaining competitive with larger models at a much lower cost. Full code can be found at https://github.com/layer6ai-labs/msc-sql.
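
To make the sample-then-critique idea concrete, the following is a minimal sketch and not the released implementation. It executes several candidate queries against a database, attaches the execution result or error to each as metadata, and passes all candidates together to a critic that picks one. The names `Candidate`, `execute_candidates`, `placeholder_critic`, and `select_best`, as well as the toy `singer` table, are hypothetical. In MSc-SQL the critic is a fine-tuned language model; here a simple heuristic stands in for it so that the example runs on its own.

```python
import sqlite3
from typing import Callable, List, NamedTuple, Optional


class Candidate(NamedTuple):
    sql: str
    rows: Optional[list]   # execution result, or None if the query failed
    error: Optional[str]   # error message, or None if the query succeeded


def execute_candidates(conn: sqlite3.Connection, sqls: List[str]) -> List[Candidate]:
    """Run every sampled query and record its result or error as metadata."""
    candidates = []
    for sql in sqls:
        try:
            rows = conn.execute(sql).fetchall()
            candidates.append(Candidate(sql, rows, None))
        except sqlite3.Error as exc:
            candidates.append(Candidate(sql, None, str(exc)))
    return candidates


def placeholder_critic(question: str, candidates: List[Candidate]) -> int:
    """Assumed stand-in for a learned critic: sees all candidates at once and
    returns the index of its choice. Prefers queries that execute, and among
    those, queries that return at least one row."""
    def score(c: Candidate) -> int:
        if c.error is not None:
            return 0
        return 2 if c.rows else 1
    return max(range(len(candidates)), key=lambda i: score(candidates[i]))


def select_best(
    conn: sqlite3.Connection,
    question: str,
    sqls: List[str],
    critic: Callable[[str, List[Candidate]], int] = placeholder_critic,
) -> Candidate:
    candidates = execute_candidates(conn, sqls)
    return candidates[critic(question, candidates)]


# Toy database and three hypothetical samples from a generator model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ann", 31), ("Bo", 45)])
samples = [
    "SELECT name FROM singers WHERE age > 40",  # wrong table name: fails
    "SELECT name FROM singer WHERE age > 50",   # runs but returns no rows
    "SELECT name FROM singer WHERE age > 40",   # runs and returns a row
]
best = select_best(conn, "Which singers are older than 40?", samples)
print(best.sql)
```

The point of the sketch is the interface: the critic receives the whole set of candidates with their execution metadata and compares them jointly, rather than scoring each query in isolation.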
