BootTOD: Bootstrap Task-oriented Dialogue Representations by Aligning Diverse Responses

Weihao Zeng; Keqing He; Yejie Wang; Dayuan Fu; Weiran Xu

TL;DR

BootTOD is a self-bootstrapping pre-training model that uses multiple appropriate response targets to model the intrinsic one-to-many diversity of human conversations. Experimental results show that it outperforms strong TOD baselines on diverse downstream dialogue tasks.

Abstract

Pre-trained language models have been successful in many scenarios. However, their usefulness in task-oriented dialogues is limited by the intrinsic linguistic differences between general text and task-oriented dialogues. Current task-oriented dialogue pre-training methods rely on a contrastive framework, which makes it difficult to select true positives and hard negatives and offers limited diversity. In this paper, we propose a novel dialogue pre-training model called BootTOD. It learns task-oriented dialogue representations via a self-bootstrapping framework. Unlike its contrastive counterparts, BootTOD aligns context and context+response representations and removes the need for contrastive pairs. BootTOD also uses multiple appropriate response targets to model the intrinsic one-to-many diversity of human conversations. Experimental results show that BootTOD outperforms strong TOD baselines on diverse downstream dialogue tasks.

Paper Structure (15 sections, 3 equations, 4 figures, 9 tables)

Introduction
Model
Overall Architecture
Bootstrap Task-oriented Dialogue Representations
Experiment
Training Details
Main Results
Qualitative Analysis
Ablation Study
Hyper-parameter Analysis
Non-Contrastive Methods Comparison
Conclusion
Bibliographical References
Evaluation Details
Non-Contrastive Methods Comparison

Figures (4)

Figure 1: The same context may have multiple appropriate responses in a task-oriented dialogue.
Figure 2: Overall architecture of our proposed BootTOD.
Figure 3: Ablation study of alignment layers. We report the results of dialogue act prediction on DSTC2. The X-axis and Y-axis denote the number of layers used for alignment and the performance, respectively.
Figure 4: Ablation study of max future length $P$. We report the results of dialogue act prediction on DSTC2. The X-axis and Y-axis denote the max future length $P$ and the performance, respectively.
