Zero and Few-shot Semantic Parsing with Ambiguous Inputs
Elias Stengel-Eskin, Kyle Rawlins, Benjamin Van Durme
TL;DR
This work tackles ambiguity in semantic parsing by introducing AmP, an extensible framework and dataset that pair each ambiguous natural-language utterance with two formal meanings ($LF_0$ and $LF_1$) across five ambiguity types, rendered in first-order logic (FOL) and Lisp. It evaluates large pre-trained models used via in-context learning in zero-shot and few-shot settings, applying constrained decoding to ensure valid outputs and using three new metrics that quantify how well models capture the distribution over multiple meanings. Without explicit ambiguity signals, these models often fail to represent multiple plausible meanings, but they can match the distribution of meanings when ambiguity is present in prompts or training data. The paper argues for including ambiguity explicitly in datasets and evaluation protocols, releases AmP with accompanying code, and highlights the potential of interactive disambiguation to improve the robustness of semantic parsing systems.
Abstract
Despite the frequent challenges posed by ambiguity when representing meaning via natural language, it is often ignored or deliberately removed in tasks mapping language to formally-designed representations, which generally assume a one-to-one mapping between linguistic and formal representations. We attempt to address this shortcoming by introducing AmP, a framework, dataset, and challenge for translating ambiguous natural language to formal representations like logic and code. We define templates and generate data for five well-documented linguistic ambiguities. Using AmP, we investigate how several few-shot text-to-code systems handle ambiguity, introducing three new metrics. We find that large pre-trained models perform poorly at capturing the distribution of possible meanings without deliberate instruction. However, models are able to capture the distribution well when ambiguity is attested in their inputs. These results motivate a call for including ambiguity explicitly in datasets and promote considering the distribution of possible outputs when evaluating systems. Data and code: https://github.com/esteng/ambiguous_parsing
