Multi-domain Neural Network Language Generation for Spoken Dialogue Systems

Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, David Vandyke, Steve Young

TL;DR

The paper addresses open-domain natural language generation (NLG) for spoken dialogue systems (SDS) with a data-driven, multi-domain, RNN-based generator trained by two-stage adaptation: the model is first trained on counterfeited data synthesised from data-rich source domains, then fine-tuned on a small set of in-domain utterances with discriminative training (DT). On BLEU score and slot error rate, counterfeiting plus DT achieves competitive performance with far less target-domain data, and human evaluation confirms the quality gains when in-domain data is scarce. The approach transfers across domains by sharing realisations among functionally similar slots, which reduces the data needed to bring up a generator in a new domain.

Abstract

Moving from limited-domain natural language generation (NLG) to open domain is difficult because the number of semantic input combinations grows exponentially with the number of domains. Therefore, it is important to leverage existing resources and exploit similarities between domains to facilitate domain adaptation. In this paper, we propose a procedure to train multi-domain, Recurrent Neural Network-based (RNN) language generators via multiple adaptation steps. In this procedure, a model is first trained on counterfeited data synthesised from an out-of-domain dataset, and then fine-tuned on a small set of in-domain utterances with a discriminative objective function. Corpus-based evaluation results show that the proposed procedure can achieve competitive performance in terms of BLEU score and slot error rate while significantly reducing the data needed to train generators in new, unseen domains. In subjective testing, human judges confirm that the procedure greatly improves generator performance when only a small amount of data is available in the domain.

Paper Structure

This paper contains 14 sections, 12 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: An example of the data counterfeiting algorithm. Both slots and values are delexicalised. Slots and values that are not in the target domain are replaced during data counterfeiting (shown in red with a * sign). The prefix inside the brackets $<>$ indicates the slot's functional class (I for informable and R for requestable).
  • Figure 2: Results evaluated on the TV domain by adapting models from the laptop domain, comparing the train-from-scratch model (scratch) with the model fine-tuning approach (tune) and the data counterfeiting method (counterfeit). $10\%\approx 700$ examples.
  • Figure 3: The same set of comparisons as in Figure 2, but evaluated by adapting from the SF restaurant and hotel joint dataset to the laptop and TV joint dataset. $10\%\approx 2K$ examples.
  • Figure 4: Effect of applying DT training after ML adaptation. The results were evaluated on laptop to TV adaptation. $10\%\approx 700$ examples.
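
The data counterfeiting step described in the abstract and in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' implementation. The token format `<I-slot>` / `<R-slot>`, the slot inventory, and the function name `counterfeit` are all assumptions made for illustration. The sketch only handles delexicalised slot tokens: a slot that does not exist in the target domain is swapped for a randomly chosen, not-yet-used target slot of the same functional class (I for informable, R for requestable). How the paper selects replacements in detail, and how it treats values, is not covered here.

```python
import random
import re

# Hypothetical target-domain inventory: slot name -> functional class
# ("I" = informable, "R" = requestable). Names are illustrative only.
TARGET_SLOTS = {"name": "I", "screensize": "I", "pricerange": "I", "power": "R"}

# Assumed token format for delexicalised slots, e.g. "<I-battery>".
SLOT_TOKEN = re.compile(r"<(I|R)-([a-z]+)>")


def counterfeit(utterance, target_slots, rng):
    """Swap slot tokens missing from the target domain for target slots
    of the same functional class; all other text is left untouched."""
    # Target slots already present in the utterance must not be reused.
    used = {name for _, name in SLOT_TOKEN.findall(utterance)
            if name in target_slots}
    mapping = {}  # source slot -> chosen target slot, kept consistent

    def swap(match):
        cls, name = match.group(1), match.group(2)
        if name in target_slots:
            return match.group(0)  # slot exists in the target domain
        if name not in mapping:
            candidates = sorted(s for s, c in target_slots.items()
                                if c == cls and s not in used)
            if not candidates:
                return match.group(0)  # nothing suitable: keep as-is
            mapping[name] = rng.choice(candidates)
            used.add(mapping[name])
        return f"<{cls}-{mapping[name]}>"

    return SLOT_TOKEN.sub(swap, utterance)


rng = random.Random(0)
source = "<I-name> has a <I-battery> battery and weighs <R-weight>"
print(counterfeit(source, TARGET_SLOTS, rng))
```

Note that the surrounding words ("battery", "weighs") stay as they were in the source domain, so a counterfeited utterance can read oddly. This is consistent with the abstract's description of counterfeited data as a first training stage that is followed by fine-tuning on real in-domain utterances.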