N-Best Hypotheses Reranking for Text-To-SQL Systems

Lu Zeng; Sree Hari Krishnan Parthasarathi; Dilek Hakkani-Tur

TL;DR

This paper investigates reranking of n-best Text-to-SQL hypotheses produced by a state-of-the-art system on the Spider dataset. It introduces two reranking strategies, query-plan-based coherence modeling and schema-linking-based correctness checking, each addressing a distinct error mode in large LM outputs. An oracle analysis of the 10-best list shows a potential 7.7% absolute gain in both exact match (EM) and execution (EX) accuracy. Combining the two approaches yields a consistent 1% improvement in EM and roughly 2.5% in EX, establishing new strong baselines. Error analysis reveals that evaluation metrics and annotation quality significantly constrain progress, underscoring the need for more robust evaluation in this domain.

Abstract

The Text-to-SQL task maps natural language utterances to structured queries that can be issued to a database. State-of-the-art (SOTA) systems rely on finetuning large, pre-trained language models in conjunction with constrained decoding that applies a SQL parser. On the well-established Spider dataset, we begin with Oracle studies: choosing an Oracle hypothesis from a SOTA model's 10-best list yields a $7.7\%$ absolute improvement in both exact match (EM) and execution (EX) accuracy, showing significant potential for improvement with reranking. Identifying coherence and correctness as reranking approaches, we design a model that generates a query plan and propose a heuristic schema linking algorithm. Combining both approaches, with T5-Large, we obtain a consistent $1\%$ improvement in EM accuracy and a $\sim 2.5\%$ improvement in EX, establishing a new SOTA for this task. Our comprehensive error studies on DEV data show the underlying difficulty in making progress on this task.
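The Oracle study described in the abstract can be illustrated with a minimal sketch: for each utterance, the Oracle counts a list as correct if any of its n-best hypotheses is correct, which upper-bounds what a reranker could achieve. The function names, the toy data, and the whitespace/case-insensitive string comparison below are assumptions for illustration only; they are not the paper's code and not the Spider EM or EX metric.

```python
from typing import Callable, List, Sequence, Tuple


def normalize(sql: str) -> str:
    """Toy normalization (assumption): lowercase and collapse whitespace.
    A hypothetical stand-in for a real exact-match or execution check."""
    return " ".join(sql.lower().split())


def toy_match(hypothesis: str, reference: str) -> bool:
    """Hypothetical correctness check based on normalized string equality."""
    return normalize(hypothesis) == normalize(reference)


def top1_and_oracle_accuracy(
    nbest_lists: Sequence[Sequence[str]],
    references: Sequence[str],
    match: Callable[[str, str], bool] = toy_match,
) -> Tuple[float, float]:
    """Return (top-1 accuracy, oracle accuracy) over a set of n-best lists.

    Top-1 scores only the first hypothesis of each list; the oracle scores
    a list as correct if any hypothesis in it matches the reference.
    """
    if len(nbest_lists) != len(references):
        raise ValueError("need exactly one reference per n-best list")
    if not references:
        raise ValueError("need at least one example")

    top1_hits = 0
    oracle_hits = 0
    for hyps, ref in zip(nbest_lists, references):
        if hyps and match(hyps[0], ref):
            top1_hits += 1
        if any(match(h, ref) for h in hyps):
            oracle_hits += 1

    total = len(references)
    return top1_hits / total, oracle_hits / total


# Made-up example data, not taken from Spider.
example_nbest: List[List[str]] = [
    ["SELECT name FROM singer", "SELECT id FROM singer"],
    ["SELECT count(*) FROM song", "select COUNT(*) from  concert"],
    ["SELECT a FROM t", "SELECT b FROM t"],
]
example_refs: List[str] = [
    "select name from singer",
    "SELECT count(*) FROM concert",
    "SELECT c FROM t",
]

top1, oracle = top1_and_oracle_accuracy(example_nbest, example_refs)
```

In this toy data the second list contains the reference only at rank 2, so the gap between `oracle` and `top1` is the headroom a reranker could in principle recover.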

Paper Structure (22 sections, 7 figures, 9 tables)

Introduction
Related Work
Text-to-SQL Using Pre-trained LMs
Reranking Approaches
Improving Coherence with Query Plan
Improving Correctness with Schema Linking
Experiments
Dataset and Metrics
Dataset
Metrics
Models
Oracle Analysis
Results
Improving Coherence with Query Plan Modeling
Improving Correctness with Schema Linking
...and 7 more sections

Figures (7)

Figure 1: PICARD explained by an example. The prediction pattern is "〈Database_name〉 | 〈pred_SQL〉".
...and 2 more figures
