SQLCritic: Correcting Text-to-SQL Generation via Clause-wise Critic

Jikai Chen, Leilei Gan, Ziyu Zhao, Zechuan Wang, Dong Wang, Chenyi Zhuang

TL;DR

A clause-wise critique generation task and an accompanying benchmark, SQLCriticBench, for fine-grained localization of syntax and semantic errors at the clause level, plus an automatic training-data curation pipeline that annotates clause-wise critiques at scale and at low cost.

Abstract

Existing refinement methods in LLM-based Text-to-SQL systems exhibit limited effectiveness. They often introduce new errors during self-correction and fail to detect and correct semantic inaccuracies. To address these gaps, we first introduce a clause-wise critique generation task along with a benchmark, SQLCriticBench, which performs fine-grained error localization, covering both syntax and semantic errors, at the clause level. Furthermore, we introduce a variant of DPO for training our SQLCritic model, in which the $\beta$ coefficient is adapted according to the clause-level inconsistencies between the preferred and dispreferred critiques. We also propose an automatic training-data curation pipeline that annotates clause-wise critiques at scale in a cost-effective way. Experiments demonstrate that the SQLCritic model significantly improves SQL accuracy on the BIRD and Spider datasets, and the results on SQLCriticBench further reveal its superior critique capabilities compared to existing models.
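
To make the adaptive-coefficient idea concrete, the following is a minimal sketch rather than the authors' formulation. The abstract states only that the DPO coefficient changes with clause-level inconsistencies between the preferred and dispreferred critiques. The per-clause verdict dictionaries, the `clause_inconsistency` measure, and the linear schedule in `adaptive_beta` (including its direction) are all hypothetical choices made for illustration. Only the standard DPO loss, $-\log \sigma(\beta \cdot \text{margin})$, is taken as given.

```python
import math


def clause_inconsistency(preferred: dict, dispreferred: dict) -> float:
    """Fraction of clauses whose verdicts differ between two critiques.

    Assumption: each critique is a mapping from clause name (e.g. "WHERE")
    to a verdict string. This representation is hypothetical.
    """
    clauses = set(preferred) | set(dispreferred)
    if not clauses:
        return 0.0
    differing = sum(1 for c in clauses if preferred.get(c) != dispreferred.get(c))
    return differing / len(clauses)


def adaptive_beta(base_beta: float, inconsistency: float, scale: float = 1.0) -> float:
    """Hypothetical schedule: beta grows linearly with the inconsistency.

    The source does not specify the form or direction of the adaptation.
    """
    return base_beta * (1.0 + scale * inconsistency)


def dpo_loss(logp_w: float, logp_l: float,
             ref_logp_w: float, ref_logp_l: float, beta: float) -> float:
    """Standard per-pair DPO loss: -log(sigmoid(beta * margin)).

    logp_w / logp_l are policy log-probabilities of the preferred and
    dispreferred critiques; ref_* are the reference-model counterparts.
    """
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    x = beta * margin
    # Numerically stable softplus(-x), which equals -log(sigmoid(x)).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# Usage: two critiques that disagree on one of two clauses.
preferred = {"SELECT": "ok", "WHERE": "error"}
dispreferred = {"SELECT": "ok", "WHERE": "ok"}
beta = adaptive_beta(0.1, clause_inconsistency(preferred, dispreferred))
loss = dpo_loss(-1.0, -3.0, -2.0, -2.0, beta)
```

Under this assumed schedule, pairs whose critiques disagree on more clauses receive a larger coefficient and therefore a sharper preference signal; a different mapping from inconsistency to coefficient would slot into `adaptive_beta` without changing the loss itself.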

Paper Structure

This paper contains 29 sections, 5 equations, 4 figures, 9 tables.

Figures (4)

  • Figure 1: Clause-wise critique in the Text-to-SQL critique and correction task.
  • Figure 2: Overview of SQLCriticBench framework and SQLCritic training process (a1-a2: SQLCriticBench construction process, b: SQLCritic training process).
  • Figure 3: Illustration of filtering low-quality generated SQL queries.
  • Figure 4: Error analysis distribution. Different colors represent the distribution of different clauses.