Robust Text-to-SQL Generation with Execution-Guided Decoding

Chenglong Wang, Kedar Tatwawadi, Marc Brockschmidt, Po-Sen Huang, Yi Mao, Oleksandr Polozov, Rishabh Singh

TL;DR

This work introduces execution-guided decoding, a test-time mechanism that prunes invalid partial SQL programs by executing them during generation. By applying this approach to four diverse text-to-SQL models across WikiSQL, ATIS, and GeoQuery, the authors achieve consistent improvements and set a new state of the art on WikiSQL with 83.8% execution accuracy. Ablation studies show that fine-grained guidance on conditional generation yields the most benefit, while the method remains simple and broadly applicable across model families. The work highlights a promising direction for combining neural generation with symbolic execution to enforce semantic correctness without retraining.

Abstract

We consider the problem of neural semantic parsing, which translates natural language questions into executable SQL queries. We introduce a new mechanism, execution guidance, to leverage the semantics of SQL. It detects and excludes faulty programs during the decoding procedure by conditioning on the execution of the partially generated program. The mechanism can be used with any autoregressive generative model, which we demonstrate on four state-of-the-art recurrent or template-based semantic parsing models. We demonstrate that execution guidance universally improves model performance on various text-to-SQL datasets with different scales and query complexity: WikiSQL, ATIS, and GeoQuery. As a result, we achieve a new state-of-the-art execution accuracy of 83.8% on WikiSQL.
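The pruning idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper applies execution guidance to partially generated programs during decoding, whereas the simplified version below only filters complete candidates from a final beam. The function name `execution_guided_pick`, the `games` table, and the candidate scores are hypothetical, and SQLite stands in for whatever executor a real system would use.

```python
import sqlite3


def execution_guided_pick(conn, candidates):
    """Return the best-scoring candidate that executes and yields rows.

    candidates: list of (sql, score) pairs, e.g. a decoder's final beam
    (hypothetical format, assumed for this sketch).
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    for sql, _score in ranked:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # execution error: discard this candidate
        if not rows:
            continue  # empty result: discard this candidate
        return sql
    # Nothing survived the filter: fall back to the top-scoring candidate.
    return ranked[0][0] if ranked else None


# Toy database and beam, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (opponent TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?)", [("Haugar", 3), ("Brann", 1)]
)

beam = [
    # Unknown column name, so SQLite raises an error.
    ("SELECT score FROM games WHERE opponentt = 'Haugar'", -0.1),
    # Valid query, but no row matches, so the result is empty.
    ("SELECT score FROM games WHERE opponent = 'UEFA'", -0.3),
    # Valid query with a non-empty result.
    ("SELECT score FROM games WHERE opponent = 'Haugar'", -0.9),
]

best = execution_guided_pick(conn, beam)
print(best)
```

In this toy beam the two higher-scoring candidates are discarded (one fails to execute, one returns no rows), so the lowest-scoring but executable query is selected. Treating an empty result as a failure is a heuristic: some questions legitimately have empty answers, and aggregate queries such as COUNT always return a row, so a real system would need to handle those cases separately.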

Paper Structure

This paper contains 23 sections, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: An execution-guided decoder evaluates partially generated queries at appropriate timesteps and then excludes those candidates that cannot be completed to a correct SQL query (red background). Here, "opponent > Haugar" would yield a runtime error, whereas "opponent = UEFA" would yield an empty result.
  • Figure 2: Overview of the Pointer-SQL model. The model encodes table columns as well as the user question with a BiLSTM and then decodes the hidden state with a typed LSTM, where the decoding action for each cell is statically determined. Source: chenglong.
  • Figure 3: Overview of the template-based baseline model. The encoder consists of a Bi-LSTM, whose outputs are then used by feed-forward networks to determine the template and fill in the slots. Source: data-sql-advising.
  • Figure 4: Example of WikiSQL and ATIS queries
  • Figure 5: Some examples where execution guidance (EG) for Coarse2Fine leads to a correct prediction. In the first example, the table column is corrected by execution guidance due to an empty output. In the second example, execution guidance corrects the sketch, because all possible slot-filling options for the three-condition sketch overconstrain the program and also yield an empty output. The experiments were performed with a beam size of 5.