Diverse In-Context Example Selection After Decomposing Programs and Aligned Utterances Improves Semantic Parsing

Mayank Kothyari, Sunita Sarawagi, Soumen Chakrabarti, Gaurav Arora, Srujana Merugu

TL;DR

This paper addresses semantic parsing with LLMs by expanding the in-context demonstration pool with fragment-level decompositions of training ASTs and aligned sub-utterances. It introduces SCUD4ICL, a two-step method that (i) decomposes training instances into meaningful sub-programs and generates subsumed sub-utterances, and (ii) selects a diverse, relevant set of ICEs from the enlarged pool at test time. Empirical results on SMCalFlow, GeoQuery, and MTOP show consistent improvements in execution accuracy over strong baselines, with the largest gains for smaller models and low-resource languages, which points to reduced in-context interference and better compositional generalization. The work provides a scalable approach to structured-output tasks under in-context learning and offers practical gains for settings with private schemas or limited labeled data.

Abstract

LLMs are increasingly used as seq2seq translators from natural language utterances to structured programs, a process called semantic interpretation. Unlike atomic labels or token sequences, programs are naturally represented as abstract syntax trees (ASTs). Such structured representation raises novel issues related to the design and selection of in-context examples (ICEs) presented to the LLM. We focus on decomposing the pool of available ICE trees into fragments, some of which may be better suited to solving the test instance. Next, we propose how to use (additional invocations of) an LLM with prompted syntax constraints to automatically map the fragments to corresponding utterances. Finally, we adapt and extend a recent method for diverse ICE selection to work with whole and fragmented ICE instances. We evaluate our system, SCUD4ICL, on popular diverse semantic parsing benchmarks, showing visible accuracy gains from our proposed decomposed diverse demonstration method. Benefits are particularly notable for smaller LLMs, ICE pools having larger labeled trees, and programs in lower resource languages.
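To make the idea of decomposing an ICE tree into fragments concrete, here is a minimal sketch that parses a parenthesised program into a nested list and enumerates all of its subtrees as candidate fragments. It is an illustration under stated assumptions, not the SCUD4ICL procedure: the function names (`parse`, `to_string`, `subtrees`) and the toy program are hypothetical, and the paper's step of generating an aligned sub-utterance for each fragment with an LLM is not shown.

```python
from typing import List, Union

# A tree is either a leaf token or a list of child trees (hypothetical encoding).
Tree = Union[str, List["Tree"]]


def parse(program: str) -> Tree:
    """Parse a parenthesised program string into a nested list."""
    tokens = program.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read() -> Tree:
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token  # leaf symbol
        node: List[Tree] = []
        while tokens[pos] != ")":
            node.append(read())
        pos += 1  # consume the closing ")"
        return node

    return read()


def to_string(tree: Tree) -> str:
    """Serialise a nested-list tree back into a parenthesised string."""
    if isinstance(tree, str):
        return tree
    return "(" + " ".join(to_string(child) for child in tree) + ")"


def subtrees(tree: Tree) -> List[Tree]:
    """Return every non-leaf subtree in pre-order, starting with the root."""
    if isinstance(tree, str):
        return []
    found: List[Tree] = [tree]
    for child in tree:
        found.extend(subtrees(child))
    return found


# Toy program, made up for illustration only.
example = "(Yield (Event (subject lunch) (start tomorrow)))"
fragments = [to_string(t) for t in subtrees(parse(example))]
for fragment in fragments:
    print(fragment)
```

Each printed fragment is a smaller program that could join the demonstration pool once paired with a sub-utterance. A practical system would likely filter fragments, for example by size or by whether they are executable on their own; that filtering is left out here.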

Paper Structure

This paper contains 31 sections, 2 equations, 12 figures, 11 tables, 1 algorithm.

Figures (12)

  • Figure 1: An example of how decomposed queries help avoid interference. On the left are three whole ICEs selected by an existing method. On the right are SCUD4ICL's ICEs. Note that two of these are decompositions of training examples, after removing irrelevant clauses. Removing the irrelevant clauses reduces interference during ICL, leading to a correct prediction from the LLM.
  • Figure 2: An example showing decomposition of a training instance by SCUD4ICL. A complex training utterance-tree pair ($\mathbf{x}_i$, $\mathbf{y}_i$) comprising more than ten clauses is decomposed into ten subtrees of varying complexity. The sub-utterances $\mathbf{x}_{i,j}$ attached to each sub-tree $\mathbf{y}_{i,j}$ are subsumed by $\mathbf{x}_i$ while being fluent and relevant to the respective $\mathbf{y}_{i,j}$. The "Let" clause, which defines $\mathbf{x}_0$, is repeated in subqueries wherever needed, but we omit repetition in the figure to reduce clutter.
  • Figure 3: Examples showing how utterances generated by SCUD4ICL conditional on original training utterances are more fluent and natural than utterances generated when the LLM is not prompted to encourage subsumption.
  • Figure 4: Instructions to the LLM for subsumed utterance decomposition in SMCalFlow. These are followed by a few decomposition ICEs. Figure \ref{fig:qd_ICL_1} shows a sample.
  • Figure 5: Accuracy gains of SCUD4ICL over the baseline on the SMCalFlow-Hi version for two different training pool sizes, showing higher gains for the smaller pool.
  • ...and 7 more figures