Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing

Ben Bogin, Matt Gardner, Jonathan Berant

TL;DR

The paper targets zero-shot text-to-SQL parsing, where new database (DB) schemas appear at test time, by explicitly modeling schema structure with a graph neural network (GNN) and integrating the resulting schema-aware representation into an encoder-decoder parser. The schema is converted into a graph with typed edges, and the GNN produces node representations that are used at both encoding and decoding time, including a self-attention mechanism over previously decoded schema items. On the Spider dataset, encoding the schema structure raises the parser's accuracy from 33.8% to 39.4%, well above the prior state of the art of 19.7%, with especially large gains on multi-table queries. Oracle relevance experiments indicate further headroom. The results suggest that schema structure matters for text-to-SQL parsing on unseen databases.

Abstract

Research on parsing language to SQL has largely ignored the structure of the database (DB) schema, either because the DB was very simple, or because it was observed at both training and test time. In Spider, a recently-released text-to-SQL dataset, new and complex DBs are given at test time, and so the structure of the DB schema can inform the predicted SQL query. In this paper, we present an encoder-decoder semantic parser, where the structure of the DB schema is encoded with a graph neural network, and this representation is later used at both encoding and decoding time. Evaluation shows that encoding the schema structure improves our parser accuracy from 33.8% to 39.4%, dramatically above the current state of the art, which is at 19.7%.

Paper Structure

This paper contains 17 sections, 5 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Examples from Spider showing how similar questions can have different SQL queries, conditioned on the schema. Table names are underlined.
  • Figure 2: The decoder we base our work on (Krishnamurthy et al., 2017). The input to the LSTM ($g_j$) at step $j$ is a learned embedding of the last decoded grammar rule, except when the last rule is schema-specific ($g_3$), where the input is a learned embedding of the schema item type. A grammar rule is selected based on the LSTM output ($o_j$) and the attended hidden state of the input LSTM ($c_j$).
  • Figure 3: Left: DB schema and question. Middle: A graph representation of the schema. Bold nodes are tables; other nodes are columns. Dashed red (blue) edges are foreign-key (primary-key) edges; green edges are table-column edges. Right: Use of the schema by the decoder. For clarity, the decoder outputs tokens rather than grammar rules.
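
The schema graph described in the Figure 3 caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `build_schema_graph`, the edge-type labels, and the example tables are all hypothetical. The caption does not say which endpoints each key edge connects, so the sketch assumes a foreign-key edge runs from the referencing column to the referenced column, a primary-key edge runs the opposite way, and table-column edges are stored in both directions.

```python
from collections import defaultdict

# Edge-type labels are illustrative names, not identifiers from the paper.
TABLE_COLUMN = "table-column"
FOREIGN_KEY = "foreign-key"
PRIMARY_KEY = "primary-key"


def build_schema_graph(tables, foreign_keys):
    """Build a typed-edge graph from a DB schema.

    tables: dict mapping table name -> list of column names.
    foreign_keys: list of ((table, column), (table, column)) pairs,
        ordered as (referencing column, referenced column).
    Returns (nodes, edges), where nodes maps node id -> "table" or "column"
    and edges maps edge type -> set of (source, target) pairs.
    """
    nodes = {}
    edges = defaultdict(set)

    for table, columns in tables.items():
        nodes[table] = "table"
        for column in columns:
            col_id = f"{table}.{column}"
            nodes[col_id] = "column"
            # Assumption: table-column edges are kept in both directions.
            edges[TABLE_COLUMN].add((table, col_id))
            edges[TABLE_COLUMN].add((col_id, table))

    for (src_table, src_col), (dst_table, dst_col) in foreign_keys:
        src = f"{src_table}.{src_col}"
        dst = f"{dst_table}.{dst_col}"
        if src not in nodes or dst not in nodes:
            raise KeyError(f"unknown column in foreign key: {src} -> {dst}")
        # Assumption: one typed edge per direction of the key relation.
        edges[FOREIGN_KEY].add((src, dst))
        edges[PRIMARY_KEY].add((dst, src))

    return nodes, dict(edges)


# Hypothetical two-table schema, used only for illustration.
tables = {
    "student": ["id", "name"],
    "enrollment": ["student_id", "course"],
}
foreign_keys = [(("enrollment", "student_id"), ("student", "id"))]
nodes, edges = build_schema_graph(tables, foreign_keys)
```

Keeping one edge set per type mirrors the idea of typed edges: a GNN layer could then apply a separate transformation to each set when aggregating messages from neighboring nodes.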