Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing

Ben Bogin, Matt Gardner, Jonathan Berant

TL;DR

The paper targets zero-shot text-to-SQL parsing, where the database is unseen at training time, by explicitly modeling the DB schema structure with a graph neural network (GNN) and integrating this schema-aware representation into an encoder-decoder parser. The schema is converted into a graph with typed edges, and the GNN produces node representations that inform both encoding and decoding, including a self-attention mechanism over previously decoded schema items. On the Spider dataset, encoding the schema structure raises the parser's accuracy from 33.8% to 39.4%, well above the prior state of the art of 19.7%, with especially large gains on multi-table queries. Oracle relevance experiments indicate further headroom. The results suggest that leveraging schema structure matters for text-to-SQL parsing on unseen databases.

Abstract

Research on parsing language to SQL has largely ignored the structure of the database (DB) schema, either because the DB was very simple, or because it was observed at both training and test time. In Spider, a recently-released text-to-SQL dataset, new and complex DBs are given at test time, and so the structure of the DB schema can inform the predicted SQL query. In this paper, we present an encoder-decoder semantic parser, where the structure of the DB schema is encoded with a graph neural network, and this representation is later used at both encoding and decoding time. Evaluation shows that encoding the schema structure improves our parser accuracy from 33.8% to 39.4%, dramatically above the current state of the art, which is at 19.7%.
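
To make the idea of a schema graph with typed edges concrete, the following is a minimal sketch, not the authors' implementation. The toy schema, the edge-type names (`column_of`, `has_column`, `foreign_key`, `foreign_key_rev`), and the scalar per-type weights are all assumptions chosen for illustration; a trained GNN would learn its parameters and would normally use a richer update rule than the simple additive one shown here.

```python
from collections import defaultdict

# Hypothetical toy schema: table name -> column names.
SCHEMA = {
    "singer": ["singer_id", "name"],
    "concert": ["concert_id", "singer_id"],
}
# Hypothetical foreign keys as (referencing column, referenced column).
FOREIGN_KEYS = [("concert.singer_id", "singer.singer_id")]


def build_schema_graph(schema, foreign_keys):
    """Return (nodes, edges); each edge is a (source, edge_type, target) triple."""
    nodes, edges = [], []
    for table, columns in schema.items():
        nodes.append(table)
        for col in columns:
            col_node = f"{table}.{col}"
            nodes.append(col_node)
            # Assumed edge types linking a column and its table in both directions.
            edges.append((col_node, "column_of", table))
            edges.append((table, "has_column", col_node))
    for src, dst in foreign_keys:
        # Assumed edge types for foreign keys, again in both directions.
        edges.append((src, "foreign_key", dst))
        edges.append((dst, "foreign_key_rev", src))
    return nodes, edges


def propagate(nodes, edges, features, type_weights):
    """One round of typed message passing.

    Each node adds to its own vector the sum of its incoming neighbours'
    vectors, scaled by a scalar weight that depends on the edge type.
    Scalar weights are a simplification standing in for learned parameters.
    """
    incoming = defaultdict(list)
    for src, etype, dst in edges:
        incoming[dst].append((src, etype))
    updated = {}
    for node in nodes:
        message = [0.0] * len(features[node])
        for src, etype in incoming[node]:
            weight = type_weights[etype]
            for i, value in enumerate(features[src]):
                message[i] += weight * value
        updated[node] = [h + m for h, m in zip(features[node], message)]
    return updated


nodes, edges = build_schema_graph(SCHEMA, FOREIGN_KEYS)
features = {node: [1.0, 0.0] for node in nodes}
type_weights = {
    "column_of": 0.5,
    "has_column": 0.5,
    "foreign_key": 1.0,
    "foreign_key_rev": 1.0,
}
node_repr = propagate(nodes, edges, features, type_weights)
```

After one round, a column's vector already reflects its table and any column it is linked to by a foreign key, which is the kind of structural signal a parser could use when choosing schema items for a multi-table query.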
