Hello, It's GPT-2 -- How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems
Paweł Budzianowski, Ivan Vulić
TL;DR
This work addresses data scarcity in task-oriented dialogue by leveraging large pretrained generative models (GPT-family) within the TransferTransfo framework. The model operates exclusively on text input, removing the need for separate policy or natural language generation modules. It adapts language modeling pretraining to dialogue tasks through fine-tuning, using text-only representations of belief states and knowledge bases, and introducing dialogue-state embeddings to handle multi-speaker context. On the MultiWOZ dataset, the approach achieves results competitive with strong baselines and offers advantages in portability and cross-domain adaptability, alongside insights into decoding strategies and human preferences. Overall, it demonstrates a data-efficient, easily adaptable direction for building engaging task-oriented agents, with future work exploring ensembles and deeper analyses of generation quality.
Abstract
Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data. In this paper, we demonstrate that recent progress in language modeling pre-training and transfer learning shows promise to overcome this problem. We propose a task-oriented dialogue model that operates solely on text input: it effectively bypasses explicit policy and language generation modules. Building on top of the TransferTransfo framework (Wolf et al., 2019) and generative model pre-training (Radford et al., 2019), we validate the approach on complex multi-domain task-oriented dialogues from the MultiWOZ dataset. Our automatic and human evaluations show that the proposed model is on par with a strong task-specific neural baseline. In the long run, our approach holds promise to mitigate the data scarcity problem, and to support the construction of more engaging and more eloquent task-oriented conversational agents.
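To make the idea of a model that "operates solely on text input" more concrete, the following is a minimal sketch of how a belief state, a database query result, and the dialogue history might be flattened into a single string before being passed to a pretrained language model. The function names, the `<belief>`, `<db>`, `<user>`, and `<system>` markers, and the exact layout are assumptions made for illustration; they are not taken from the paper or from the TransferTransfo code.

```python
from typing import Dict, List, Tuple


def serialize_belief_state(belief: Dict[str, Dict[str, str]]) -> str:
    """Flatten a {domain: {slot: value}} belief state into plain text.

    Domains and slots are sorted so the output is deterministic.
    The layout is a hypothetical choice, not the paper's format.
    """
    parts = []
    for domain in sorted(belief):
        slot_text = " ".join(
            f"{slot} {value}" for slot, value in sorted(belief[domain].items())
        )
        parts.append(f"{domain} {slot_text}".strip())
    return " ; ".join(parts)


def serialize_db_counts(counts: Dict[str, int]) -> str:
    """Describe, per domain, how many database entries match the belief state."""
    return " ; ".join(f"{domain} {n} match" for domain, n in sorted(counts.items()))


def build_model_input(
    history: List[Tuple[str, str]],
    belief: Dict[str, Dict[str, str]],
    db_counts: Dict[str, int],
) -> str:
    """Concatenate belief state, database summary, and dialogue turns.

    The trailing <system> marker (an assumed convention) cues the model
    to generate the next system response.
    """
    turns = " ".join(f"<{speaker}> {utterance}" for speaker, utterance in history)
    return (
        f"<belief> {serialize_belief_state(belief)} "
        f"<db> {serialize_db_counts(db_counts)} "
        f"{turns} <system>"
    )


if __name__ == "__main__":
    example = build_model_input(
        history=[("user", "I need an Italian place in the centre.")],
        belief={"restaurant": {"food": "italian", "area": "centre"}},
        db_counts={"restaurant": 3},
    )
    print(example)
```

Because every component is rendered as ordinary text, the same pretrained tokenizer and model can consume the whole context, which is what allows explicit policy and generation modules to be dropped in this kind of setup.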
