Shift-Reduce Task-Oriented Semantic Parsing with Stack-Transformers

Daniel Fernández-González

TL;DR

This paper tackles task-oriented semantic parsing by proposing shift-reduce parsers built on Stack-Transformers to produce well-formed TOP trees. It introduces three transition systems (top-down, bottom-up, and in-order) adapted from constituency parsing, together with a neural model that combines frozen RoBERTa-based embeddings with a stack- and buffer-aware decoder. Across high-resource and low-resource settings on the Facebook TOP benchmarks, the in-order transition system consistently yields the best accuracy, often surpassing strong sequence-to-sequence baselines while guaranteeing valid TOP structures. The work demonstrates that structured shift-reduce parsing with Stack-Transformers can achieve state-of-the-art results, and it points to further gains via ensembling or the integration of hierarchical information.

Abstract

Intelligent voice assistants, such as Apple Siri and Amazon Alexa, are now widely used. These task-oriented dialogue systems require a semantic parsing module to process user utterances and identify the action to be performed. This component was initially implemented with rule-based or statistical slot-filling approaches for simple queries; however, more complex utterances demanded shift-reduce parsers or sequence-to-sequence models. Although shift-reduce approaches were initially considered the most promising option, sequence-to-sequence neural systems have since become the highest-performing method for this task. In this article, we advance the research on shift-reduce semantic parsing for task-oriented dialogue. We implement novel shift-reduce parsers that rely on Stack-Transformers. This framework allows transition systems to be modeled adequately on the Transformer neural architecture, notably boosting shift-reduce parsing performance. Furthermore, our approach goes beyond the conventional top-down algorithm: we incorporate alternative bottom-up and in-order transition systems, derived from constituency parsing, into task-oriented parsing. We extensively test our approach on multiple domains from the Facebook TOP benchmark, improving over existing shift-reduce parsers and state-of-the-art sequence-to-sequence models in both high-resource and low-resource settings. We also show empirically that the in-order algorithm substantially outperforms the commonly used top-down strategy. By introducing new transition systems and building on a robust neural architecture, our study shows that shift-reduce parsers can outperform leading sequence-to-sequence methods on the main benchmark.
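To make the in-order strategy mentioned in the abstract more concrete, the following is a minimal sketch of how an in-order action sequence could be executed against a stack and a buffer. It assumes the standard in-order inventory from constituency parsing (`SHIFT`, `NT-X`, `REDUCE`); the paper's exact action set and constraints may differ. The names `Node`, `OpenNT`, and `run_in_order`, as well as the example utterance, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A finished constituent: an intent (IN:) or slot (SL:) with its children."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)

    def bracketed(self) -> str:
        parts = [c.bracketed() if isinstance(c, Node) else c for c in self.children]
        return "[" + " ".join([self.label] + parts) + " ]"


@dataclass
class OpenNT:
    """A non-terminal that has been projected on the stack but not yet closed."""
    label: str


def run_in_order(words: List[str], actions: List[str]) -> Node:
    """Apply in-order actions (assumed inventory: SHIFT, NT-X, REDUCE)."""
    stack: list = []
    buffer = list(words)
    for action in actions:
        if action == "SHIFT":
            if not buffer:
                raise ValueError("SHIFT with an empty buffer")
            stack.append(buffer.pop(0))
        elif action.startswith("NT-"):
            # In-order: the non-terminal is projected after its first child,
            # so the stack top must be a word or a finished constituent.
            if not stack or isinstance(stack[-1], OpenNT):
                raise ValueError("NT-X needs a first child on top of the stack")
            stack.append(OpenNT(action[3:]))
        elif action == "REDUCE":
            later_children = []
            while stack and not isinstance(stack[-1], OpenNT):
                later_children.append(stack.pop())
            if len(stack) < 2:
                raise ValueError("REDUCE without an open non-terminal")
            open_nt = stack.pop()
            first_child = stack.pop()  # the element sitting below the non-terminal
            stack.append(Node(open_nt.label, [first_child] + later_children[::-1]))
        else:
            raise ValueError(f"unknown action: {action}")
    if buffer or len(stack) != 1 or not isinstance(stack[0], Node):
        raise ValueError("action sequence does not yield a single tree")
    return stack[0]


# Illustrative utterance and actions (not from the paper).
words = ["play", "songs", "by", "Adele"]
actions = [
    "SHIFT", "NT-IN:PLAY_MUSIC", "SHIFT", "SHIFT",
    "SHIFT", "NT-SL:ARTIST", "REDUCE", "REDUCE",
]
tree = run_in_order(words, actions)
print(tree.bracketed())  # [IN:PLAY_MUSIC play songs by [SL:ARTIST Adele ] ]
```

Because every `REDUCE` closes exactly one open non-terminal and the run ends only with a single constituent on the stack, any accepted action sequence yields a properly bracketed tree. This is the sense in which a shift-reduce parser can guarantee well-formed output, whereas a sequence-to-sequence model that emits brackets token by token cannot.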

Paper Structure (26 sections, 3 equations, 4 figures, 7 tables)

Figures (4)

  • Figure 1: Flat and compositional TOP annotations of utterances from music and navigation domains, respectively. Note that intents and slots are respectively prefixed with IN: and SL:.
  • Figure 2: Transformer neural architecture introduced by Vaswani et al. (2017). Note that this neural network requires positional encoding for each input token to maintain sequential order, and Layer Norm refers to the layer normalization technique proposed by Ba et al. (2016).
  • Figure 3: Updates to the masks $m^{\textit{stack}}_t$ and $m^{\textit{buffer}}_t$ that reflect the effects of certain in-order transitions on the stack and buffer during the shift-reduce parsing process illustrated in Table \ref{fig:example3}.
  • Figure 4: Performance comparison of the three transition systems relative to utterance length and structural factors.
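
The flat versus compositional distinction in the Figure 1 caption can be sketched in code. The snippet below assumes the whitespace-separated bracket notation used for TOP annotations, with `IN:` and `SL:` prefixes as in the caption; the helper names `parse_top` and `is_compositional` and both example annotations are hypothetical and not taken from the figure.

```python
from typing import List, Tuple, Union

# A tree node is (label, children); children are words or nested nodes.
TopNode = Tuple[str, List[Union[str, tuple]]]


def parse_top(annotation: str) -> TopNode:
    """Parse a bracketed TOP annotation into a nested (label, children) tree."""
    root = None
    open_nodes: List[TopNode] = []
    for token in annotation.split():
        if token.startswith("["):
            label = token[1:]
            if not label.startswith(("IN:", "SL:")):
                raise ValueError(f"label must start with IN: or SL:, got {label!r}")
            node: TopNode = (label, [])
            if open_nodes:
                open_nodes[-1][1].append(node)
            elif root is None:
                root = node
            else:
                raise ValueError("annotation has more than one root")
            open_nodes.append(node)
        elif token == "]":
            if not open_nodes:
                raise ValueError("unbalanced closing bracket")
            open_nodes.pop()
        else:
            if not open_nodes:
                raise ValueError("word outside of any intent or slot")
            open_nodes[-1][1].append(token)
    if open_nodes or root is None:
        raise ValueError("unbalanced or empty annotation")
    return root


def is_compositional(node: TopNode, inside_slot: bool = False) -> bool:
    """Return True if some intent is nested inside a slot."""
    label, children = node
    if label.startswith("IN:") and inside_slot:
        return True
    return any(
        is_compositional(child, label.startswith("SL:"))
        for child in children
        if isinstance(child, tuple)
    )


# Illustrative annotations (not the ones shown in Figure 1).
flat = parse_top("[IN:PLAY_MUSIC play songs by [SL:ARTIST Adele ] ]")
nested = parse_top(
    "[IN:GET_DIRECTIONS directions to "
    "[SL:DESTINATION [IN:GET_EVENT the [SL:NAME_EVENT Eagles ] game ] ] ]"
)
print(is_compositional(flat), is_compositional(nested))  # False True
```

A flat annotation has one intent whose slots contain only words, while a compositional one nests a further intent inside a slot. The parser rejects unbalanced brackets, which is the kind of malformed output a structure-guaranteeing parser avoids by construction.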