Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing

Xi Victoria Lin, Richard Socher, Caiming Xiong

TL;DR

BRIDGE tackles cross-DB text-to-SQL parsing by serializing the NL question and the relational DB schema into a tagged sequence and encoding it with BERT, augmented by anchor-text bridging that links question content to field values. The model uses a lightweight, single-layer LSTM decoder with pointer-copy capabilities and schema-consistency driven pruning to generate SQL in execution order, achieving state-of-the-art or near state-of-the-art results on Spider and WikiSQL, including strong ensemble performance on Spider. Ablation and error analyses show bridging and encoding choices substantially improve results, though the approach struggles with compositional generalization and explainability. The work suggests BRIDGE’s approach to joint textual-tabular understanding can generalize to related tasks but invites further work on compositionality, interpretability, and broader DB-content integration.

Abstract

We present BRIDGE, a powerful sequential architecture for modeling dependencies between natural language questions and relational databases in cross-DB semantic parsing. BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question. The hybrid sequence is encoded by BERT with minimal subsequent layers and the text-DB contextualization is realized via the fine-tuned deep attention in BERT. Combined with a pointer-generator decoder with schema-consistency driven search space pruning, BRIDGE attained state-of-the-art performance on popular cross-DB text-to-SQL benchmarks, Spider (71.1% dev, 67.5% test with ensemble model) and WikiSQL (92.6% dev, 91.9% test). Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks. Our implementation is available at https://github.com/salesforce/TabularSemanticParsing.
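The tagged hybrid sequence described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not taken from the released implementation: the tag tokens `[T]`, `[C]`, and `[V]`, the `serialize` helper, the toy schema, and the case-insensitive substring test that stands in for whatever value-matching procedure the authors use.

```python
from typing import Dict, List

# Hypothetical schema format: table name -> field name -> known cell values.
Schema = Dict[str, Dict[str, List[str]]]


def serialize(question: str, schema: Schema) -> str:
    """Flatten a question and a DB schema into one tagged sequence.

    Fields whose cell values are mentioned in the question get those
    values appended after the field name (the "anchor text").
    """
    q_lower = question.lower()
    parts = ["[CLS]", question, "[SEP]"]
    for table, fields in schema.items():
        parts += ["[T]", table]
        for field, values in fields.items():
            parts += ["[C]", field]
            for value in values:
                # Assumed stand-in for value matching: substring test.
                if value.lower() in q_lower:
                    parts += ["[V]", value]
    parts.append("[SEP]")
    return " ".join(parts)


# Toy schema, invented for this example.
schema = {
    "properties": {
        "property_type_code": ["house", "apartment"],
        "price": [],
    },
}
seq = serialize("Show the prices of all houses", schema)
print(seq)
```

In this sketch only `property_type_code` receives an anchor value, because "house" occurs in the question and "apartment" does not. The resulting string is what would be tokenized and passed to the encoder.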

Paper Structure

This paper contains 53 sections, 2 theorems, 9 equations, 11 figures, 12 tables.

Key Result

Lemma 1

Let $Y_{\text{exec}}$ be a SQL query with clauses arranged in execution order. Then any table field in $Y_{\text{exec}}$ must appear after the table.
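A small Python sketch can make the property in Lemma 1 concrete. It rests on assumptions that are not stated on this page: the clause order in `EXEC_ORDER` (the usual logical processing order, with `FROM` first and `SELECT` after `HAVING`), the `table.field` token convention, and the helper names, all of which are hypothetical.

```python
from typing import Dict, List, Set

# Assumed execution order of SQL clauses.
EXEC_ORDER = ["FROM", "WHERE", "GROUP BY", "HAVING", "SELECT", "ORDER BY", "LIMIT"]


def to_execution_order(clauses: Dict[str, List[str]]) -> List[str]:
    """Emit clause keywords and their tokens in execution order."""
    tokens: List[str] = []
    for keyword in EXEC_ORDER:
        if keyword in clauses:
            tokens.append(keyword)
            tokens.extend(clauses[keyword])
    return tokens


def fields_follow_tables(tokens: List[str], tables: Set[str]) -> bool:
    """Return True if every `table.field` token comes after its table token."""
    seen: Set[str] = set()
    for tok in tokens:
        if tok in tables:
            seen.add(tok)
            continue
        table, sep, _ = tok.partition(".")
        if sep and table in tables and table not in seen:
            return False
    return True


tables = {"singer"}
clauses = {
    "SELECT": ["singer.name"],
    "FROM": ["singer"],
    "WHERE": ["singer.age", ">", "30"],
}
exec_tokens = to_execution_order(clauses)
written_tokens = [
    "SELECT", "singer.name", "FROM", "singer", "WHERE", "singer.age", ">", "30",
]
print(fields_follow_tables(exec_tokens, tables))     # execution order
print(fields_follow_tables(written_tokens, tables))  # written order
```

The check passes for the execution-order sequence and fails for the written-order one, where `singer.name` precedes `singer`. A decoder that generates in execution order could use the same `seen` set to restrict field candidates to tables already produced, which is one plausible reading of schema-consistency driven pruning.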

Figures (11)

  • Figure 1: Two questions from the Spider dataset with similar intent resulted in completely different SQL logical forms on two DBs. In cross-DB text-to-SQL semantic parsing, the interpretation of a natural language question is strictly grounded in the underlying relational DB schema.
  • Figure 2: The BRIDGE encoder. The two phrases "houses" and "apartments" in the input question both matched to two DB fields. The matched values are appended to the corresponding field names in the hybrid sequence.
  • Figure 3: Distribution of # non-numeric values in the ground truth SQL queries in the Spider and WikiSQL dev sets.
  • Figure 4: BRIDGE error type distribution (Spider dev).
  • Figure A1: Performance of ensemble models w.r.t. the number of models in the ensemble.
  • ...and 6 more figures

Theorems & Definitions (2)

  • Lemma 1
  • Lemma 2