SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation

Adrián Bazaga, Pietro Liò, Gos Micklem

TL;DR

SQLformer tackles text-to-SQL translation across unseen databases by generating SQL queries as abstract syntax trees (ASTs), represented as graphs in breadth-first search (BFS) order and guided by a schema-aware Transformer encoder. The encoder uses learnable table and column tokens to ground the natural language question (NLQ) in the database schema, while the autoregressive decoder conditions on node type, adjacency, and previous actions to predict grammar rules. It achieves state-of-the-art performance across six benchmarks, including Spider and context-dependent datasets, with notable improvements in exact match (EM) and execution accuracy (EX) and strong zero-shot generalization. Ablation studies show that table/column selection and grammar-aware decoding are critical for both accuracy and efficiency.

Abstract

In recent years, the task of text-to-SQL translation, which converts natural language questions into executable SQL queries, has gained significant attention for its potential to democratize data access. Despite its promise, challenges such as adapting to unseen databases and aligning natural language with SQL syntax have hindered widespread adoption. To overcome these issues, we introduce SQLformer, a novel Transformer architecture specifically crafted to perform text-to-SQL translation tasks. Our model predicts SQL queries as abstract syntax trees (ASTs) in an autoregressive way, incorporating structural inductive bias in the encoder and decoder layers. This bias, guided by database table and column selection, aids the decoder in generating SQL query ASTs represented as graphs in a Breadth-First Search canonical order. Our experiments demonstrate that SQLformer achieves state-of-the-art performance across six prominent text-to-SQL benchmarks.
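
To make the "Breadth-First Search canonical order" concrete, the following is a minimal sketch of how a small SQL AST could be linearized level by level, together with the parent index of each node (a simple stand-in for adjacency information). The node labels, the tree shape, and the helper name `bfs_linearize` are illustrative assumptions, not the grammar or code used by the authors.

```python
from collections import deque

# Toy AST for: SELECT name FROM scientists WHERE ssn = 1
# Labels, column names, and structure are hypothetical, chosen only for illustration.
AST = ("SELECT_STMT", [
    ("SELECT", [("column:name", [])]),
    ("FROM", [("table:scientists", [])]),
    ("WHERE", [("=", [("column:ssn", []), ("value:1", [])])]),
])


def bfs_linearize(root):
    """Return node labels in BFS order and, for each node, its parent's BFS index."""
    labels, parents = [], []
    queue = deque([(root, -1)])  # (node, parent index); the root has parent -1
    while queue:
        (label, children), parent = queue.popleft()
        idx = len(labels)  # position of this node in the BFS sequence
        labels.append(label)
        parents.append(parent)
        for child in children:
            queue.append((child, idx))
    return labels, parents


labels, parents = bfs_linearize(AST)
for i, (label, parent) in enumerate(zip(labels, parents)):
    print(i, label, "parent:", parent)
```

In this ordering all clause-level nodes (SELECT, FROM, WHERE) appear before any of their children, so a decoder that emits one node per step would fix the overall query skeleton before filling in columns, tables, and values.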

Paper Structure

This paper contains 30 sections, 11 equations, 4 figures, and 11 tables.

Figures (4)

  • Figure 1: An illustration of SQLformer: our model inherits the seq2seq nature of the Transformer architecture, consisting of $L$ layers of encoders and decoders. The SQLformer encoder introduces database table and column selection as inductive biases to contextualize the embedding of a question. In this example, the question consists of six tokens (Fig. 2). This schema-conditioned question representation serves as input to the SQLformer decoder module. Here we show the decoding timestep $t = 4$ as an example. The architecture of the decoder module is detailed in Fig. 4.
  • Figure 2: Illustration of an example Spider question with six tokens as a graph $\mathcal{G}$ with part-of-speech and dependency relations. In this example, the token $number$ has an OBJECT dependency with $Find$, and $Find$ and $number$ are tagged as verb (VB) and noun (NN), respectively.
  • Figure 3: An illustration of an example Spider schema for database $scientist\_1$. In this example, there are a total of 3 tables ($scientists$, $projects$, $assigned\_to$), with multiple columns for each table and relationships between the tables.
  • Figure 4: Overview of the SQLformer decoder architecture.