Hello, It's GPT-2 -- How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems

Paweł Budzianowski, Ivan Vulić

TL;DR

This work addresses data scarcity in task-oriented dialogue by fine-tuning a large pretrained generative model (GPT-2) within the TransferTransfo framework so that it operates exclusively on text input, removing the need for separate policy and natural language generation modules. The belief state and the knowledge-base (database) state are represented as plain text and joined with the dialogue context, and learned token-level (dialogue-state) embeddings mark which part of the input each token belongs to. On the MultiWOZ dataset, automatic and human evaluations show results on par with a strong task-specific neural baseline, and the work also offers insights into decoding strategies and human preferences. Overall, it points to a data-efficient, easily adaptable direction for building engaging task-oriented agents, with future work exploring ensembles and deeper analyses of generation quality.

Abstract

Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data. In this paper, we demonstrate that recent progress in language modeling pre-training and transfer learning shows promise to overcome this problem. We propose a task-oriented dialogue model that operates solely on text input: it effectively bypasses explicit policy and language generation modules. Building on top of the TransferTransfo framework (Wolf et al., 2019) and generative model pre-training (Radford et al., 2019), we validate the approach on complex multi-domain task-oriented dialogues from the MultiWOZ dataset. Our automatic and human evaluations show that the proposed model is on par with a strong task-specific neural baseline. In the long run, our approach holds promise to mitigate the data scarcity problem, and to support the construction of more engaging and more eloquent task-oriented conversational agents.

Paper Structure

This paper contains 15 sections, 3 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Dialogue-context-to-text task.
  • Figure 2: The framework for modeling task-oriented conversations based on a pretrained GPT model which uses only unstructured simple text as input. The context, belief state, and database state are joined together without explicit standalone dialogue policy and generation modules. The token-level (i.e., dialogue-state) embeddings are learned following Wolf et al. (2019).
  • Figure 3: The comparison of generated responses from the baseline model and GPT2-M.
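
The text-only input described in the Figure 2 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the segment labels, the ordering of the three parts, the helper names (`flatten_belief_state`, `flatten_db_state`, `build_input`), and the whitespace split standing in for GPT-2's subword tokenizer are all assumptions made for illustration. The sketch flattens a belief state and a database state into plain text, joins them with the dialogue context, and produces a parallel list of segment ids of the kind that token-level (dialogue-state) embeddings would be looked up from.

```python
from typing import Dict, List, Tuple

# Hypothetical segment labels; the special tokens used in the paper may differ.
SEGMENTS = {"belief": 0, "db": 1, "user": 2, "system": 3}


def flatten_belief_state(belief: Dict[str, Dict[str, str]]) -> str:
    """Render a nested {domain: {slot: value}} belief state as plain text."""
    parts = []
    for domain, slots in belief.items():
        slot_text = " ".join(f"{slot} {value}" for slot, value in slots.items())
        parts.append(f"{domain} {slot_text}".strip())
    return " ".join(parts)


def flatten_db_state(db_counts: Dict[str, int]) -> str:
    """Render the number of matching database entities per domain as text."""
    return " ".join(f"{domain} {count}" for domain, count in db_counts.items())


def build_input(
    belief: Dict[str, Dict[str, str]],
    db_counts: Dict[str, int],
    context: List[Tuple[str, str]],
) -> Tuple[List[str], List[int]]:
    """Join belief state, database state, and context into one token sequence.

    Returns the tokens and a same-length list of segment ids. The ordering
    (belief, database, context) is an assumption of this sketch.
    """
    tokens: List[str] = []
    segment_ids: List[int] = []

    def add(text: str, segment: str) -> None:
        # Whitespace split is a stand-in for a real subword tokenizer.
        words = text.split()
        tokens.extend(words)
        segment_ids.extend([SEGMENTS[segment]] * len(words))

    add(flatten_belief_state(belief), "belief")
    add(flatten_db_state(db_counts), "db")
    for speaker, utterance in context:
        add(utterance, speaker)  # speaker must be "user" or "system"
    return tokens, segment_ids


# Illustrative (made-up) example inputs.
example_belief = {"hotel": {"area": "north", "stars": "4"}}
example_db = {"hotel": 3}
example_context = [
    ("user", "i need a hotel"),
    ("system", "which area ?"),
    ("user", "north please"),
]
example_tokens, example_segments = build_input(
    example_belief, example_db, example_context
)
```

In a fine-tuning setup, the tokens would be mapped to vocabulary ids and the segment ids to a second embedding table whose vectors are added to the word embeddings, so that the model can tell belief-state, database, user, and system tokens apart without any structured input.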