SPFT-SQL: Enhancing Large Language Model for Text-to-SQL Parsing by Self-Play Fine-Tuning

Yuhao Zhang, Shaoming Duan, Jinhang Su, Chuanyi Liu, Peiyi Han

TL;DR

SPFT-SQL addresses the limitations of self-play in Text-to-SQL by introducing a verification-based iterative fine-tuning stage to generate high-quality, schema-informed data and an error-driven self-play loss that encourages the main model to outperform erroneous outputs from its opponent. The approach alternates between data augmentation and adversarial training, enabling open-source LLMs to reach competitive performance without relying on closed-source data. Empirical results across five SPIDER-family benchmarks and six open-source models show SPFT-SQL outperforming SPIN and many SOTA SFT-based methods, with notable gains in execution accuracy and data efficiency. The work demonstrates the practicality of open-source Text-to-SQL systems achieving close-to-closed-source performance, while highlighting computation and data-quality considerations for future improvements.

Abstract

Despite the significant advancements of self-play fine-tuning (SPIN), which can transform a weak large language model (LLM) into a strong one through competitive interactions between models of varying capabilities, it still faces challenges in the Text-to-SQL task. SPIN does not generate new information, and the large number of correct SQL queries produced by the opponent model during self-play reduces the main model's ability to generate accurate SQL queries. To address this challenge, we propose a new self-play fine-tuning method tailored for the Text-to-SQL task, called SPFT-SQL. Prior to self-play, we introduce a verification-based iterative fine-tuning approach, which synthesizes high-quality fine-tuning data iteratively based on the database schema and validation feedback to enhance model performance, while building a model base with varying capabilities. During the self-play fine-tuning phase, we propose an error-driven loss method that incentivizes incorrect outputs from the opponent model, enabling the main model to distinguish between correct SQL and erroneous SQL generated by the opponent model, thereby improving its ability to generate correct SQL. Extensive experiments and in-depth analyses on six open-source LLMs and five widely used benchmarks demonstrate that our approach outperforms existing state-of-the-art (SOTA) methods.
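As an illustration only, the idea of keeping just the opponent's erroneous SQL for self-play can be sketched with execution-based checking. The sketch below is not the authors' implementation: the helper names `run_sql` and `select_erroneous_pairs`, the use of an in-memory SQLite database, and the choice to compare result sets as order-insensitive multisets are all assumptions made for this example.

```python
import sqlite3
from collections import Counter


def run_sql(schema_sql, query):
    """Run `query` on a fresh in-memory SQLite database built from `schema_sql`.

    Returns the result rows as a multiset (row order ignored),
    or None if the schema or the query fails to execute.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        return Counter(conn.execute(query).fetchall())
    except sqlite3.Error:
        return None
    finally:
        conn.close()


def select_erroneous_pairs(schema_sql, examples):
    """Keep (question, gold_sql, opponent_sql) triples whose opponent SQL is wrong.

    "Wrong" here means (an assumption of this sketch) that the opponent query
    fails to execute or returns different rows than the gold query.
    """
    kept = []
    for question, gold_sql, opponent_sql in examples:
        gold_rows = run_sql(schema_sql, gold_sql)
        if gold_rows is None:
            continue  # the gold query itself does not verify; skip the example
        if run_sql(schema_sql, opponent_sql) != gold_rows:
            kept.append((question, gold_sql, opponent_sql))
    return kept
```

Under these assumptions, opponent outputs that already match the gold execution result are dropped, so the remaining pairs always contrast a correct query with an incorrect one.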

Paper Structure

This paper contains 46 sections, 9 equations, 6 figures, 16 tables, 1 algorithm.

Figures (6)

  • Figure 1: Comparison results on the Spider (Yu et al., 2018) and BIRD (Li et al., 2024) datasets; the base model for SFT, SPIN (Chen et al., 2024), and SPFT-SQL is Qwen2.5-Coder 7B.
  • Figure 2: An overview of SPFT-SQL framework.
  • Figure 3: Comparison of different iterations.
  • Figure 4: Performance with a varying amount of synthetic data in each round.
  • Figure 5: Performance of LLMs fine-tuned on synthetic data generated by SPFT-SQL and baselines.
  • ...and 1 more figure