ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL

Yang Qin, Chao Chen, Zhihang Fu, Ze Chen, Dezhong Peng, Peng Hu, Jieping Ye

TL;DR

ROUTE tackles the Text-to-SQL challenge for open-source LLMs by combining multitask supervised fine-tuning (MSFT) across Text2SQL, schema linking, noise correction, and continuation writing with a multitask collaboration prompting (MCP) strategy at inference. This design explicitly reduces SQL-generation hallucinations and enhances generalization across multiple models and benchmarks. Through extensive ablations and transferability analyses, the method demonstrates strong results on SPIDER/BIRD and variants, narrowing the gap to closed-model approaches. The work provides a practical, privacy-conscious path to robust Text2SQL with open models and offers insights into multi-task data synthesis and collaborative prompting for complex NLP-to-database tasks.

Abstract

Despite the significant advancements in Text-to-SQL (Text2SQL) facilitated by large language models (LLMs), the latest state-of-the-art techniques are still trapped in the in-context learning of closed-source LLMs (e.g., GPT-4), which limits their applicability in open scenarios. To address this challenge, we propose a novel RObust mUltitask Tuning and collaboration mEthod (ROUTE) to improve the comprehensive capabilities of open-source LLMs for Text2SQL, thereby providing a more practical solution. Our approach begins with multi-task supervised fine-tuning (SFT) using various synthetic training data related to SQL generation. Unlike existing SFT-based Text2SQL methods, we introduce several additional SFT tasks, including schema linking, noise correction, and continuation writing. Engaging in a variety of SQL generation tasks enhances the model's understanding of SQL syntax and improves its ability to generate high-quality SQL queries. Additionally, inspired by the collaborative modes of LLM agents, we introduce a Multitask Collaboration Prompting (MCP) strategy. This strategy leverages collaboration across several SQL-related tasks to reduce hallucinations during SQL generation, thereby making fuller use of the model's explicit multitask capabilities to improve Text2SQL performance. Extensive experiments and in-depth analyses have been performed on eight open-source LLMs and five widely used benchmarks. The results demonstrate that our proposal outperforms the latest Text2SQL methods and yields leading performance.

Paper Structure

This paper contains 26 sections, 2 equations, 9 figures, 15 tables, 1 algorithm.

Figures (9)

  • Figure 1: Overall framework of our ROUTE. Our approach consists of two core stages, i.e., Multitask Supervised Fine-tuning (MSFT) and Multitask Collaboration Prompting (MCP). Different from existing methods that focus only on single-task learning in a single LLM, our MSFT aims to empower LLMs to handle multiple SQL-specific tasks by utilizing synthetic data for supervised fine-tuning. MCP mainly leverages the capabilities of individual tasks for a given database and question to collaboratively generate accurate SQL queries. It enhances the final SQL query incrementally through a three-step process that leverages the multitasking capabilities of LLMs. Note that TS is the abbreviation of Text2SQL, and the task definitions of SL, NC, and CW can be found in Section 3.1.
  • Figure 2: Illustration of all SQL-related tasks involved in our ROUTE. These tasks are based on a given database and user question. We form the prompt for each task by extracting a schema (table) description, a few example rows from the database, and the given question (possibly with a hint).
  • Figure 3: Examples of noisy pairs in the BIRD training set. R1 and R2 are the corrected SQL queries. More noisy examples can be found in \ref{['a_4']}.
  • Figure 4: Transferability results across different open-source LLMs on SPIDER (first row) and BIRD (second row). See \ref{['a_tran']} for detailed results.
  • Figure 5: The prompt of Text-to-SQL.
  • ...and 4 more figures
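
The Figure 2 caption says each task prompt is formed from a schema (table) description, a few example rows from the database, and the question, possibly with a hint. The sketch below illustrates that assembly for a SQLite database. It is not the authors' prompt template: the function names (`make_demo_db`, `describe_database`, `build_prompt`), the toy `singer` table, and the output layout are all assumptions made for illustration.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database (hypothetical toy data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bob', 41), (3, 'Cy', 25);
        """
    )
    return conn


def describe_database(conn: sqlite3.Connection, n_rows: int = 3) -> str:
    """Return each table's CREATE statement plus up to n_rows sample rows."""
    parts = []
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for name, create_sql in tables:
        parts.append(create_sql.strip())
        # Table names are read from sqlite_master, not from user input.
        cursor = conn.execute(f'SELECT * FROM "{name}" LIMIT ?', (n_rows,))
        columns = [col[0] for col in cursor.description]
        parts.append("/* sample rows: " + ", ".join(columns) + " */")
        for row in cursor.fetchall():
            parts.append("/* " + ", ".join(repr(v) for v in row) + " */")
    return "\n".join(parts)


def build_prompt(
    conn: sqlite3.Connection,
    question: str,
    hint: Optional[str] = None,
    n_rows: int = 3,
) -> str:
    """Assemble schema description, sample rows, question, and optional hint."""
    lines = [
        "Database schema and sample rows:",
        describe_database(conn, n_rows),
        f"Question: {question}",
    ]
    if hint:
        lines.append(f"Hint: {hint}")
    lines.append("SQL:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(make_demo_db(), "How many singers are older than 28?"))
```

Under these assumptions, the same three ingredients could be reused for the other tasks (schema linking, noise correction, continuation writing) by changing only the instruction and the trailing cue, which is one plausible reading of how a single model can serve several SQL-related tasks.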