Solid-SQL: Enhanced Schema-linking based In-context Learning for Robust Text-to-SQL
Geling Liu, Yunzhi Tan, Ruichao Zhong, Yuanzhen Xie, Lingchen Zhao, Qian Wang, Bo Hu, Zang Li
TL;DR
Solid-SQL tackles robustness gaps in LLM-based text-to-SQL with a pre-processing pipeline that augments the schema-linking training data, fine-tunes a schema-linking model, and retrieves in-context examples by skeleton similarity, combined with an explicit attention mechanism in the prompt. The method is designed as a plug-in that supports multiple LLMs and uses a two-round in-context learning process to stabilize SQL generation under perturbations. The task is formalized as $S = M(Q, SC)$, where the model $M$ maps a question $Q$ and a database schema $SC$ to a SQL query $S$. Robustness requires $DB(M(Q, SC)) = DB(M(Q^*, SC))$: executing the generated SQL on the database $DB$ returns the same result for the original question $Q$ and for its perturbed version $Q^*$. Key contributions include robust data augmentation for schema linking, a fine-tuned schema-linking model, skeleton-based question and SQL matching strategies, and an attention-guided prompt design, all supported by extensive ablations. Empirically, Solid-SQL achieves state-of-the-art execution accuracy on general benchmarks ($EX$ up to $82.1\%$ on Spider and $58.9\%$ on Bird) and yields an average improvement of $11.6\%$ over baselines on the perturbed Spider-Syn, Spider-Realistic, and Dr. Spider benchmarks, demonstrating practical gains for reliable text-to-SQL in adversarial settings.
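The robustness condition $DB(M(Q,SC)) = DB(M(Q^*,SC))$ can be checked mechanically by executing both generated queries and comparing their results. The following is a minimal sketch of such a check, not the authors' evaluation code: the in-memory SQLite database, the `singer` table, the helper names `execute` and `is_robust`, and the three SQL strings (standing in for model outputs) are all hypothetical. Results are compared as multisets of rows, which is one reasonable way to ignore row order.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run a query and return its rows as a multiset, so row order is ignored."""
    return Counter(conn.execute(sql).fetchall())


def is_robust(conn, sql_original, sql_perturbed):
    """True when both queries return the same result on the same database.

    sql_original stands in for M(Q, SC) and sql_perturbed for M(Q*, SC).
    A query that fails to execute counts as not robust.
    """
    try:
        return execute(conn, sql_original) == execute(conn, sql_perturbed)
    except sqlite3.Error:
        return False


# Hypothetical toy database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ann', 31), (2, 'Bo', 24), (3, 'Cy', 45);
""")

# Hypothetical model outputs for a question and a perturbed version of it.
sql_q = "SELECT name FROM singer WHERE age > 30"
sql_q_star = "SELECT s.name FROM singer AS s WHERE s.age >= 31"
sql_drifted = "SELECT name FROM singer WHERE age > 40"

print(is_robust(conn, sql_q, sql_q_star))   # different SQL text, same result
print(is_robust(conn, sql_q, sql_drifted))  # result changed under perturbation
```

Note that the condition compares execution results rather than SQL text, so two differently written queries still count as consistent if they return the same rows.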
Abstract
Recently, large language models (LLMs) have significantly improved the performance of text-to-SQL systems. Nevertheless, many state-of-the-art (SOTA) approaches have overlooked the critical aspect of system robustness. Our experiments reveal that while LLM-driven methods excel on standard datasets, their accuracy is notably compromised when faced with adversarial perturbations. To address this challenge, we propose a robust text-to-SQL solution, called Solid-SQL, designed to integrate with various LLMs. We focus on the pre-processing stage, training a robust schema-linking model enhanced by LLM-based data augmentation. Additionally, we design a two-round, structural similarity-based example retrieval strategy for in-context learning. Our method achieves SOTA SQL execution accuracy of 82.1% and 58.9% on the general Spider and Bird benchmarks, respectively. Furthermore, experimental results show that Solid-SQL delivers an average improvement of 11.6% over baselines on the perturbed Spider-Syn, Spider-Realistic, and Dr. Spider benchmarks.
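To illustrate what structural similarity-based example retrieval could look like, the sketch below reduces each SQL query to a skeleton (keywords and operators kept, identifiers and literals masked) and ranks candidate examples by skeleton similarity. This is an assumption-laden illustration rather than the method described in the paper: the keyword list, the masking rules, the function names `sql_skeleton` and `rank_examples`, the example pool, and the use of `difflib.SequenceMatcher` as the similarity measure are all choices made here for the sake of a runnable example.

```python
import re
from difflib import SequenceMatcher

# Small illustrative keyword set; a real system would need a fuller list.
SQL_KEYWORDS = {
    "SELECT", "FROM", "WHERE", "GROUP", "ORDER", "BY", "HAVING", "JOIN", "ON",
    "AS", "AND", "OR", "NOT", "IN", "LIMIT", "DESC", "ASC", "DISTINCT",
    "COUNT", "SUM", "AVG", "MIN", "MAX",
}


def sql_skeleton(sql):
    """Mask identifiers and literals with '_' while keeping keywords and operators."""
    sql = re.sub(r"'[^']*'", "_", sql)  # mask string literals first
    tokens = re.findall(r"[A-Za-z_][\w.]*|\d+(?:\.\d+)?|\S", sql)
    out = []
    for tok in tokens:
        if tok.upper() in SQL_KEYWORDS:
            out.append(tok.upper())
        elif tok[0].isalnum() or tok[0] == "_":
            out.append("_")  # table, column, alias, or numeric literal
        else:
            out.append(tok)  # operator or punctuation
    return " ".join(out)


def rank_examples(draft_sql, pool, k=2):
    """Return the k pool examples whose SQL skeleton is closest to the draft's."""
    target = sql_skeleton(draft_sql)
    return sorted(
        pool,
        key=lambda ex: SequenceMatcher(None, target, sql_skeleton(ex["sql"])).ratio(),
        reverse=True,
    )[:k]


# Hypothetical example pool of (question, SQL) pairs.
pool = [
    {"question": "How many concerts are there?",
     "sql": "SELECT COUNT(*) FROM concert"},
    {"question": "Which singers are older than 30?",
     "sql": "SELECT name FROM singer WHERE age > 30"},
    {"question": "List stadium names by capacity.",
     "sql": "SELECT name FROM stadium ORDER BY capacity DESC"},
]

draft = "SELECT title FROM movie WHERE year > 2000"
print(sql_skeleton(draft))
print(rank_examples(draft, pool, k=1)[0]["question"])
```

Because the skeleton discards table names, column names, and values, two queries over unrelated databases can match exactly when they share the same structure, which is the property that makes retrieval less sensitive to surface wording.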
