A Static and Dynamic Attention Framework for Multi Turn Dialogue Generation

Wei-Nan Zhang, Yiming Cui, Kaiyan Zhang, Yifa Wang, Qingfu Zhu, Lingzhi Li, Ting Liu

TL;DR

A static and dynamic attention-based approach that models the dialogue history to generate open-domain multi-turn dialogue responses, addressing the vanishing gradient problem.

Abstract

Recently, research on open-domain dialogue systems has attracted extensive interest from academic and industrial researchers. The goal of an open-domain dialogue system is to imitate humans in conversation. Previous work on single-turn conversation generation has greatly advanced research on open-domain dialogue systems. However, understanding multiple single-turn conversations is not equivalent to understanding a multi-turn dialogue, because human dialogue is coherent and context dependent. Therefore, in open-domain multi-turn dialogue generation, it is essential to model the contextual semantics of the dialogue history rather than rely only on the last utterance. Previous research has verified the effectiveness of the hierarchical recurrent encoder-decoder framework for open-domain multi-turn dialogue generation. However, using an RNN-based model to hierarchically encode the utterances into a representation of the dialogue history still faces the vanishing gradient problem. To address this issue, we propose a static and dynamic attention-based approach that models the dialogue history and then generates open-domain multi-turn dialogue responses. Experimental results on the Ubuntu and Opensubtitles datasets verify the effectiveness of the proposed approach on automatic and human evaluation metrics in various experimental settings. We also empirically verify the performance of combining the static and dynamic attentions on open-domain multi-turn dialogue generation.

Paper Structure

This paper contains 14 sections, 12 equations, 6 figures, 21 tables.

Figures (6)

  • Figure 1: The encoders of three prevalent frameworks, (a) HRAN [ms20171], (b) WSI [ACL17-2036], and (c) ReCoSa [zhang2019recosa], for generating open-domain multi-turn dialogue responses.
  • Figure 2: The proposed attentive framework for the generation of open domain dialogues. Here, $u_*$ denotes the $*$-th utterance in a conversation. $h_*$ is the hidden state. $c_*$ and $c$ indicate the context representations that are obtained by the dynamic attention and static attention respectively.
  • Figure 3: The impact of context length on the performance of the Static and Dynamic models on the Ubuntu and Opensubtitles datasets. Context length ranges from 2 to 9.
  • Figure 4: Token-frequency statistics over the vocabulary for the static and dynamic attention-based models and the ReCoSa model.
  • Figure 5: The collocation reservation rate of the static and dynamic attention-based models and the ReCoSa model on the test set.
  • ...and 1 more figure
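
The notation in the Figure 2 caption (utterance hidden states $h_*$, a single static context $c$, and per-step dynamic contexts $c_*$) can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the choice of query vectors, and the function names `static_context` and `dynamic_contexts` are assumptions made purely for illustration. The only point shown is the difference in when the attention weights are computed: once before decoding (static) versus once per decoding step (dynamic).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its attention weight."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def static_context(h, query):
    """Static attention (illustrative): one context vector c,
    computed once from the utterance states h and reused at every step.
    Dot-product scoring is an assumption, not the paper's scoring function."""
    weights = softmax([dot(query, h_i) for h_i in h])
    return weighted_sum(weights, h)


def dynamic_contexts(h, decoder_states):
    """Dynamic attention (illustrative): a separate context vector c_t
    for each decoding step, re-weighted by that step's decoder state."""
    return [static_context(h, s_t) for s_t in decoder_states]


# Toy utterance-level hidden states h_1..h_3 (hypothetical values).
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Static: query with the last utterance state (an assumed choice).
c = static_context(h, h[-1])

# Dynamic: two hypothetical decoder states, one per decoding step.
c_steps = dynamic_contexts(h, [[2.0, 0.0], [0.0, 2.0]])
```

In this toy setup the static context is fixed for the whole response, while each dynamic context shifts toward the utterance states that best match the current decoder state.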