Compositional Generalization and Natural Language Variation: Can a Semantic Parsing Approach Handle Both?

Peter Shaw; Ming-Wei Chang; Panupong Pasupat; Kristina Toutanova

TL;DR

The paper asks whether a single semantic parsing approach can handle both natural language variation and out-of-distribution compositional generalization. It introduces TMCD splits for evaluating compositional generalization on non-synthetic data and proposes NQG-T5, a hybrid model that combines a high-precision grammar-based component (NQG) with a pre-trained sequence-to-sequence model (T5). Empirically, NQG-T5 achieves the best average performance across the SCAN, GeoQuery, and Spider splits evaluated, while remaining far from solving the problem. The results highlight the value of diverse evaluations and suggest that further progress will require architectures that blend symbolic precision with neural flexibility.

Abstract

Sequence-to-sequence models excel at handling natural language variation, but have been shown to struggle with out-of-distribution compositional generalization. This has motivated new specialized architectures with stronger compositional biases, but most of these approaches have only been evaluated on synthetically-generated datasets, which are not representative of natural language variation. In this work we ask: can we develop a semantic parsing approach that handles both natural language variation and compositional generalization? To better assess this capability, we propose new train and test splits of non-synthetic datasets. We demonstrate that strong existing approaches do not perform well across a broad set of evaluations. We also propose NQG-T5, a hybrid model that combines a high-precision grammar-based approach with a pre-trained sequence-to-sequence model. It outperforms existing approaches across several compositional generalization challenges on non-synthetic data, while also being competitive with the state-of-the-art on standard evaluations. While still far from solving this problem, our study highlights the importance of diverse evaluations and the open challenge of handling both compositional generalization and natural language variation in semantic parsing.
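The hybrid idea in the abstract can be illustrated with a small sketch. It assumes the two components are combined by a simple fallback: the high-precision grammar-based parser answers when it can produce a parse, and the sequence-to-sequence model answers otherwise. The abstract does not spell out the combination mechanism, so the fallback rule, the function names, and the toy parsers below are all hypothetical stand-ins, not the authors' implementation.

```python
from typing import Callable, Optional

# Hypothetical interfaces: a grammar-based parser may abstain (return None),
# while a seq2seq parser always returns some output string.
GrammarParser = Callable[[str], Optional[str]]
Seq2SeqParser = Callable[[str], str]


def hybrid_parse(
    utterance: str,
    grammar_parser: GrammarParser,
    seq2seq_parser: Seq2SeqParser,
) -> str:
    """Prefer the high-precision parser; fall back to seq2seq when it abstains."""
    prediction = grammar_parser(utterance)
    if prediction is not None:
        return prediction
    return seq2seq_parser(utterance)


# Toy stand-ins for illustration only (assumptions, not real models).
def toy_grammar_parser(utterance: str) -> Optional[str]:
    # A lookup table playing the role of a grammar with limited coverage.
    rules = {"jump twice": "JUMP JUMP", "walk": "WALK"}
    return rules.get(utterance)


def toy_seq2seq_parser(utterance: str) -> str:
    # A placeholder that always produces an output, as a neural model would.
    return "<seq2seq output for: " + utterance + ">"


covered = hybrid_parse("jump twice", toy_grammar_parser, toy_seq2seq_parser)
uncovered = hybrid_parse("hop around", toy_grammar_parser, toy_seq2seq_parser)
```

Under this assumption, the grammar-based component contributes precision on inputs it covers, and the pre-trained model supplies coverage for the natural language variation the grammar misses.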
