KeyInst: Keyword Instruction for Improving SQL Formulation in Text-to-SQL

Xiping Liu, Zhao Tan

TL;DR

This research introduces Keyword Instruction (KeyInst), a novel method for enhancing SQL formulation by Large Language Models (LLMs). KeyInst provides guidance on pivotal SQL keywords likely to be part of the final query, thereby facilitating a smoother SQL query formulation process.

Abstract

Text-to-SQL parsing translates natural language queries (NLQs) into their corresponding SQL commands. A principal challenge in this domain is formulating SQL queries that are not only syntactically correct but also semantically aligned with the natural language input, a task made difficult by the intrinsic disparity between the NLQ and the SQL. In this research, we introduce Keyword Instruction (KeyInst), a novel method designed to enhance SQL formulation by Large Language Models (LLMs). KeyInst provides guidance on pivotal SQL keywords likely to be part of the final query, thereby facilitating a smoother SQL query formulation process. We explore two strategies for integrating KeyInst into Text-to-SQL parsing: a pipeline strategy and a single-pass strategy. The former first generates a KeyInst for the question, which is then used to prompt the LLM. The latter employs a fine-tuned model to generate the KeyInst and the SQL concurrently in one step. We developed StrucQL, a benchmark specifically designed for the evaluation of SQL formulation. Extensive experiments on StrucQL and other benchmarks demonstrate that KeyInst significantly improves upon existing Text-to-SQL prompting techniques.
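
The pipeline strategy described in the abstract can be illustrated with a minimal sketch. It assumes a generic `llm` callable that maps a prompt string to a completion string. The prompt wording and the function names (`build_keyinst_prompt`, `build_sql_prompt`, `run_pipeline`) are hypothetical and are not taken from the paper; only the two-stage flow (first generate a KeyInst, then prompt for SQL with it) follows the abstract.

```python
from typing import Callable


def build_keyinst_prompt(schema: str, question: str) -> str:
    """Stage 1 prompt: ask for a keyword instruction, not SQL (wording is hypothetical)."""
    return (
        "Given the database schema and question, describe which SQL keywords "
        "the final query is likely to need. Do not write SQL.\n"
        f"Schema: {schema}\nQuestion: {question}\nKeyword instruction:"
    )


def build_sql_prompt(schema: str, question: str, keyinst: str) -> str:
    """Stage 2 prompt: ask for SQL, conditioned on the stage 1 keyword instruction."""
    return (
        "Write a SQL query that answers the question, guided by the hint.\n"
        f"Schema: {schema}\nQuestion: {question}\n"
        f"Keyword instruction: {keyinst}\nSQL:"
    )


def run_pipeline(schema: str, question: str, llm: Callable[[str], str]) -> str:
    """Two LLM calls: first generate the KeyInst, then generate SQL using it."""
    keyinst = llm(build_keyinst_prompt(schema, question)).strip()
    return llm(build_sql_prompt(schema, question, keyinst)).strip()
```

Any text-completion client can be passed as `llm`. The single-pass strategy would instead use one call to a fine-tuned model that emits both the KeyInst and the SQL, which this sketch does not cover.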

Paper Structure

This paper contains 25 sections, 4 equations, 6 figures, 8 tables.

Figures (6)

  • Figure 1: Graphical illustration of KeyInst and its applications: A. An example of schema, question, KeyInst, and SQL, B. The pipeline approach of KeyInst application, C. The single-pass approach of KeyInst application.
  • Figure 2: An example of the KeyInst.
  • Figure 3: Examples of KeyInst-FT and SQL skeleton.
  • Figure 4: An example of Schema-Simplified question and schema.
  • Figure 5: Examples of KeyInst-FT and KeyInst-ICL.
  • ...and 1 more figure