Table of Contents
Fetching ...

Reboost Large Language Model-based Text-to-SQL, Text-to-Python, and Text-to-Function -- with Real Applications in Traffic Domain

Guanghu Sui, Zhishuai Li, Ziyue Li, Sun Yang, Jingqing Ruan, Hangyu Mao, Rui Zhao

TL;DR

This work tackles the gap between research-grade Text-to-SQL performance and real-world business needs by introducing ReBoostSQL, a four-module prompting framework (Query Rewriting, Explain-Squeeze Schema Linking, SQL Generation, SQL Boosting) that explicitly handles domain-specific language and large, column-rich schemas. The approach yields substantial improvements on two traffic-domain databases (CTraffic and CompanyZ) over the previous SOTA DINSQL, achieving an EX of 0.6579 with GPT-4 on CTraffic and demonstrating robust generalization across LLMs. The study also extends the framework to Text2Python and Text2Function, revealing trade-offs: Text2Python can be less efficient due to longer code generation and token demands, while Text2Function can reduce hallucinations through structured templates at the cost of manual integration and potential overfitting. Together, these results offer practical guidance for deploying NL-to-database interfaces in industry and inform future research on tailoring prompting and modularization to domain-specific data and tasks.

Abstract

The previous state-of-the-art (SOTA) method achieved a remarkable execution accuracy on the Spider dataset, which is one of the largest and most diverse datasets in the Text-to-SQL domain. However, during our reproduction of the business dataset, we observed a significant drop in performance. We examined the differences in dataset complexity, as well as the clarity of questions' intentions, and assessed how those differences could impact the performance of prompting methods. Subsequently, We develop a more adaptable and more general prompting method, involving mainly query rewriting and SQL boosting, which respectively transform vague information into exact and precise information and enhance the SQL itself by incorporating execution feedback and the query results from the database content. In order to prevent information gaps, we include the comments, value types, and value samples for columns as part of the database description in the prompt. Our experiments with Large Language Models (LLMs) illustrate the significant performance improvement on the business dataset and prove the substantial potential of our method. In terms of execution accuracy on the business dataset, the SOTA method scored 21.05, while our approach scored 65.79. As a result, our approach achieved a notable performance improvement even when using a less capable pre-trained language model. Last but not least, we also explore the Text-to-Python and Text-to-Function options, and we deeply analyze the pros and cons among them, offering valuable insights to the community.

Reboost Large Language Model-based Text-to-SQL, Text-to-Python, and Text-to-Function -- with Real Applications in Traffic Domain

TL;DR

This work tackles the gap between research-grade Text-to-SQL performance and real-world business needs by introducing ReBoostSQL, a four-module prompting framework (Query Rewriting, Explain-Squeeze Schema Linking, SQL Generation, SQL Boosting) that explicitly handles domain-specific language and large, column-rich schemas. The approach yields substantial improvements on two traffic-domain databases (CTraffic and CompanyZ) over the previous SOTA DINSQL, achieving an EX of 0.6579 with GPT-4 on CTraffic and demonstrating robust generalization across LLMs. The study also extends the framework to Text2Python and Text2Function, revealing trade-offs: Text2Python can be less efficient due to longer code generation and token demands, while Text2Function can reduce hallucinations through structured templates at the cost of manual integration and potential overfitting. Together, these results offer practical guidance for deploying NL-to-database interfaces in industry and inform future research on tailoring prompting and modularization to domain-specific data and tasks.

Abstract

The previous state-of-the-art (SOTA) method achieved a remarkable execution accuracy on the Spider dataset, which is one of the largest and most diverse datasets in the Text-to-SQL domain. However, during our reproduction of the business dataset, we observed a significant drop in performance. We examined the differences in dataset complexity, as well as the clarity of questions' intentions, and assessed how those differences could impact the performance of prompting methods. Subsequently, We develop a more adaptable and more general prompting method, involving mainly query rewriting and SQL boosting, which respectively transform vague information into exact and precise information and enhance the SQL itself by incorporating execution feedback and the query results from the database content. In order to prevent information gaps, we include the comments, value types, and value samples for columns as part of the database description in the prompt. Our experiments with Large Language Models (LLMs) illustrate the significant performance improvement on the business dataset and prove the substantial potential of our method. In terms of execution accuracy on the business dataset, the SOTA method scored 21.05, while our approach scored 65.79. As a result, our approach achieved a notable performance improvement even when using a less capable pre-trained language model. Last but not least, we also explore the Text-to-Python and Text-to-Function options, and we deeply analyze the pros and cons among them, offering valuable insights to the community.
Paper Structure (24 sections, 12 figures, 5 tables)

This paper contains 24 sections, 12 figures, 5 tables.

Figures (12)

  • Figure 1: LLM-based Text-to-SQL model translates NL to database query and gives the proper answer
  • Figure 2: The overview of the proposed method: ReBoostSQL
  • Figure 3: An illustration of Query Rewriting process with prompts of extract, transform and replace
  • Figure 4: The illustration of the explain-squeeze schema linking pipeline.
  • Figure 5: An example showing the process of SQL generation with 'ValueType, Meaning and Sampling' highlighted.
  • ...and 7 more figures