Sequence-to-Sequence Generation for Spoken Dialogue via Deep Syntax Trees and Strings

Ondřej Dušek, Filip Jurčíček

TL;DR

The paper studies natural language generation for spoken dialogue systems by directly comparing two-step generation (sentence planning followed by surface realization) with joint, one-step generation. Both setups use the same seq2seq framework, which can produce either natural language strings or deep syntax trees from input dialogue acts. The joint model, augmented with beam search and a reranker, achieves higher automatic scores and more relevant outputs, even with the limited training data of the BAGEL restaurant dataset. The generator also learns to produce meaningful utterances and valid deep syntax trees from very little data, which reduces reliance on handcrafted realizers. The authors release their code publicly and propose bidirectional encoders and sequence-level training as future work.

Abstract

We present a natural language generator based on the sequence-to-sequence approach that can be trained to produce natural language strings as well as deep syntax dependency trees from input dialogue acts, and we use it to directly compare two-step generation with separate sentence planning and surface realization stages to a joint, one-step approach. We were able to train both setups successfully using very little training data. The joint setup offers better performance, surpassing the state of the art with regard to n-gram-based scores while providing more relevant outputs.

Paper Structure

This paper contains 10 sections, 2 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Example DA (top) with the corresponding deep syntax tree (middle) and natural language string (bottom)
  • Figure 2: Trees encoded as sequences for the seq2seq generator (top) and the reranker (bottom)
  • Figure 3: Seq2seq generator with attention
  • Figure 4: The reranker
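
Figure 2 refers to trees encoded as sequences so that a seq2seq model can emit them token by token. The following is a minimal sketch of one way such an encoding could work, not the paper's exact scheme: the `Node` class, the `linearize` function, the bracket tokens, and the example lemmas and labels are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical deep syntax tree node: a lemma plus a syntactic label."""
    lemma: str
    label: str
    left: List["Node"] = field(default_factory=list)   # children preceding the node
    right: List["Node"] = field(default_factory=list)  # children following the node


def linearize(node: Node) -> List[str]:
    """Encode a tree as a flat, bracketed token sequence (in-order traversal)."""
    tokens = ["("]
    for child in node.left:
        tokens.extend(linearize(child))
    tokens.extend([node.lemma, node.label])
    for child in node.right:
        tokens.extend(linearize(child))
    tokens.append(")")
    return tokens


# Illustrative tree only; the lemmas and labels are made up for this sketch.
tree = Node(
    "be", "v:fin",
    left=[Node("X-name", "n:subj")],
    right=[Node("restaurant", "n:obj", left=[Node("italian", "adj:attr")])],
)
sequence = linearize(tree)
print(" ".join(sequence))
```

Because every subtree is wrapped in a matching pair of brackets, a decoder's output can be checked for well-formedness by bracket counting before it is converted back into a tree.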