NL-EDIT: Correcting semantic parse errors through natural language interaction

Ahmed Elgohary, Christopher Meek, Matthew Richardson, Adam Fourney, Gonzalo Ramos, Ahmed Hassan Awadallah

TL;DR

NL-EDIT tackles correcting semantic parse errors in text-to-SQL via natural-language feedback. It introduces a formal SQL-Edit representation and an edit-based correction model that grounds feedback in the full interaction context using a relation-aware encoder and a standard decoder to output edits, rather than full queries. A synthetic data generation pipeline augments training, enabling substantial improvements over Splash baselines and across multiple parsers in zero-shot settings. The work demonstrates that one-turn NL feedback can markedly boost parsing accuracy and offers avenues for better user experience and broader applicability of the edit-based correction paradigm.

Abstract

We study semantic parsing in an interactive setting in which users correct errors with natural language feedback. We present NL-EDIT, a model for interpreting natural language feedback in the interaction context to generate a sequence of edits that can be applied to the initial parse to correct its errors. We show that NL-EDIT can boost the accuracy of existing text-to-SQL parsers by up to 20% with only one turn of correction. We analyze the limitations of the model and discuss directions for improvement and evaluation. The code and datasets used in this paper are publicly available at http://aka.ms/NLEdit.

Paper Structure

This paper contains 18 sections, 1 equation, 5 figures, 6 tables, 1 algorithm.

Figures (5)

  • Figure 1: Example human interaction with NL-EDIT to correct an initial parse through natural language feedback. In the Semantic Parsing Phase (top), an off-the-shelf parser generates an initial SQL query and provides an answer paired with an explanation of the generated SQL. In the Correction Phase (bottom), the user reviews the explanation and provides feedback that describes how the explanation should be corrected. The system parses the feedback as a set of edits that are applied to the initial parse to generate a corrected SQL query.
  • Figure 2: Edit for transforming the source query "SELECT id, MAX(grade) FROM assignments WHERE grade > 20 AND id NOT IN (SELECT id from graduates) GROUP BY id" to the target "SELECT id, AVG(grade) FROM assignment WHERE grade > 20 GROUP BY id ORDER BY id". The source and target are represented as sets of clauses (left and middle). The set of edits and its linearized form (Section \ref{sec:model}) are shown on the right. Removing the condition "id NOT IN $\text{SUBS}_1$" makes the subquery unreferenced, hence pruned from the edit.
  • Figure 3: The encoder of NL-EDIT grounds the feedback into the explanation, the question, and the schema by (1) passing the concatenation of their tokens through BERT, then (2) combining self-learned and hard-coded relations in a relation-aware transformer. Three types of relations (Interaction Relations) link the individual tokens of the inputs. Question-Schema and Schema-Schema relations are not shown.
  • Figure 4: (a-c) Breakdown of the correction accuracy on the Splash test set by (a) feedback length, (b) explanation length, and (c) size of the reference edit (number of add or remove operations). The number of examples in each group is shown on top of the bars. (d) Transitions in edit size after correction. For each edit size of the initial parse (rows), we show the distribution of the edit size after correction.
  • Figure 5: Distribution of edit size per example in Splash compared to the generalization test sets constructed based on EditSQL, TaBERT, and RAT-SQL.
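
The clause-set view of an edit described in the Figure 2 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the dictionary-of-sets query representation, the names `diff` and `apply_edit`, and the clause keys are all hypothetical. The FROM clause is left out for brevity, and the subquery placeholder `SUBS_1` is treated as an opaque string, so the pruning of unreferenced subqueries is not modeled.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: clause name -> set of clause items.
Query = Dict[str, Set[str]]
# One edit operation: (clause, "add" or "remove", item).
Edit = List[Tuple[str, str, str]]


def diff(source: Query, target: Query) -> Edit:
    """Compute add/remove operations that turn source into target."""
    edits: Edit = []
    for clause in sorted(set(source) | set(target)):
        src = source.get(clause, set())
        tgt = target.get(clause, set())
        edits += [(clause, "remove", item) for item in sorted(src - tgt)]
        edits += [(clause, "add", item) for item in sorted(tgt - src)]
    return edits


def apply_edit(source: Query, edits: Edit) -> Query:
    """Apply add/remove operations to a copy of the source query."""
    result = {clause: set(items) for clause, items in source.items()}
    for clause, op, item in edits:
        items = result.setdefault(clause, set())
        if op == "add":
            items.add(item)
        else:
            items.discard(item)
    # Drop clauses that became empty.
    return {clause: items for clause, items in result.items() if items}


# Clause sets loosely following the Figure 2 example (FROM omitted).
source = {
    "select": {"id", "MAX(grade)"},
    "where": {"grade > 20", "id NOT IN SUBS_1"},
    "group_by": {"id"},
}
target = {
    "select": {"id", "AVG(grade)"},
    "where": {"grade > 20"},
    "group_by": {"id"},
    "order_by": {"id"},
}

edits = diff(source, target)
corrected = apply_edit(source, edits)
```

In this sketch the edit contains four operations (one add to ORDER BY, one remove and one add in SELECT, one remove in WHERE), and applying it to the source reproduces the target. Predicting such a short edit rather than a full query is the design choice the TL;DR attributes to NL-EDIT.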