Few-shot Natural Language Generation for Task-Oriented Dialog

Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng, Jianfeng Gao

TL;DR

This work addresses the data-efficiency challenge of NLG in task-oriented dialog (TOD) by introducing FewShotWoz, a few-shot benchmark spanning seven domains, and SC-GPT, a semantically conditioned GPT model that enables controllable generation grounded in dialog acts. Training proceeds in three stages: massive plain-language pre-training, dialog-act controlled pre-training, and domain-focused fine-tuning. This pipeline yields strong generalization to unseen dialog acts and superior performance with limited in-domain labels, as shown on FewShotWoz and MultiWOZ through automatic metrics and human evaluation. The paper also provides a critical analysis of benchmark datasets and demonstrates that pre-training on large annotated corpora improves adaptability to new domains, with SC-GPT outperforming SC-LSTM, GPT-2, and HDSA in most settings. Overall, the approach advances practical deployment of NLG in TOD by reducing labeling costs and enabling robust, controllable, domain-general generation. Future work aims at richer interpersonal responses and end-to-end pipeline pre-training.

Abstract

As a crucial component in task-oriented dialog systems, the Natural Language Generation (NLG) module converts a dialog act represented in a semantic form into a response in natural language. The success of traditional template-based or statistical models typically relies on heavily annotated data, which is infeasible for new domains. Therefore, it is pivotal for an NLG system to generalize well with limited labelled data in real applications. To this end, we present FewShotWoz, the first NLG benchmark to simulate the few-shot learning setting in task-oriented dialog systems. Further, we develop the SC-GPT model. It is pre-trained on a large annotated NLG corpus to acquire the controllable generation ability, and fine-tuned with only a few domain-specific labels to adapt to new domains. Experiments on FewShotWoz and the large Multi-Domain-WOZ datasets show that the proposed SC-GPT significantly outperforms existing methods, as measured by various automatic metrics and human evaluations.

Paper Structure

This paper contains 23 sections, 4 equations, 2 figures, 10 tables.

Figures (2)

  • Figure 1: Illustration of the NLG module in the overall task-oriented dialog system. (a) The NLG module is highlighted with glowing black bounding boxes. (b) One example of dialog act (including intent and slot-value pairs) and its corresponding natural language response.
  • Figure 2: Illustration of SC-GPT. In this example, SC-GPT generates a new word token (e.g., "confirm" or "center") by attending to the entire dialog act and the word tokens to its left within the response.
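
The figure captions describe a dialog act as an intent plus slot-value pairs, which the model conditions on while generating the response left to right. A minimal sketch of how such an act might be flattened into a text prefix for a GPT-style model is given below. The class name `DialogAct`, the helper names, the `intent ( slot = value ; ... )` layout, the `&` separator, and the example values are all illustrative assumptions, not the paper's confirmed input format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DialogAct:
    """Hypothetical container: one intent plus ordered slot-value pairs."""
    intent: str
    slots: List[Tuple[str, str]] = field(default_factory=list)


def linearize(act: DialogAct) -> str:
    """Flatten a dialog act into a single string (assumed layout)."""
    if not act.slots:
        return f"{act.intent} ( )"
    pairs = " ; ".join(f"{slot} = {value}" for slot, value in act.slots)
    return f"{act.intent} ( {pairs} )"


def build_training_text(act: DialogAct, response: str, sep: str = "&") -> str:
    """Join the linearized act and its response into one sequence.

    A left-to-right language model trained on such sequences sees the
    whole dialog act before it predicts any response token.
    The separator token is an assumption made for this sketch.
    """
    return f"{linearize(act)} {sep} {response}"


# Illustrative values only.
act = DialogAct("confirm", [("name", "Hilton"), ("area", "center")])
print(linearize(act))
print(build_training_text(act, "Let me confirm: the Hilton in the center?"))
```

Because the act is placed before the response in a single sequence, ordinary causal attention is enough for every generated token to attend to the full dialog act and to the response tokens already produced.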