Who speaks like a style of Vitamin: Towards Syntax-Aware Dialogue Summarization using Multi-task Learning

Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc, Heuiseok Lim

TL;DR

This work focuses on the association between utterances from individual speakers and their unique syntactic structures. It constructs a syntax-aware model that leverages linguistic information (i.e., POS tagging) to inherently distinguish sentences uttered by individual speakers, which alleviates two difficulties of dialogue summarization: key information scattered across utterances and informal, speaker-specific textual styles.

Abstract

Abstractive dialogue summarization is a challenging task for several reasons. First, most of the important pieces of information in a conversation are scattered across utterances through multi-party interactions with different textual styles. Second, dialogues are often informally structured, with different individuals expressing personal perspectives, unlike text summarization tasks, which usually target formal documents such as news articles. To address these issues, we focused on the association between utterances from individual speakers and unique syntactic structures. Speakers have unique textual styles that can carry linguistic information, much like a voiceprint. Therefore, we constructed a syntax-aware model by leveraging linguistic information (i.e., POS tagging), which alleviates the above issues by inherently distinguishing sentences uttered by individual speakers. We employed multi-task learning of both syntax-aware information and dialogue summarization. To the best of our knowledge, our approach is the first method to apply multi-task learning to the dialogue summarization task. Experiments on the SAMSum corpus (a large-scale dialogue summarization corpus) demonstrated that our method improved upon the vanilla model. We further analyze the costs and benefits of our approach relative to baseline models.
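
The multi-task objective described in the abstract can be illustrated with a minimal sketch. It assumes the two tasks (summary token prediction and POS sequence labeling) are each trained with token-level cross-entropy and combined as a weighted sum. The weighting scheme, the `pos_weight` parameter, and all function names here are assumptions for illustration only and are not taken from the paper.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of the target index under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def mean_cross_entropy(logit_rows, targets):
    """Average cross-entropy over a sequence of positions."""
    losses = [cross_entropy(row, t) for row, t in zip(logit_rows, targets)]
    return sum(losses) / len(losses)


def multitask_loss(summary_logits, summary_targets,
                   pos_logits, pos_targets, pos_weight=0.5):
    """Weighted sum of the summarization loss and the POS tagging loss.

    pos_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    loss_summary = mean_cross_entropy(summary_logits, summary_targets)
    loss_pos = mean_cross_entropy(pos_logits, pos_targets)
    return loss_summary + pos_weight * loss_pos
```

In a real model the two sets of logits would come from the decoder head (summary tokens) and the encoder head (POS tags) respectively; here they are plain lists of floats so the combination step can be inspected in isolation.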

Paper Structure

This paper contains 36 sections, 11 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Example utterance of formal and informal sentences of the same meaning from different speakers, with the different parts-of-speech labeled. The histogram shows the different individual textual styles.
  • Figure 2: Overview of the model architecture. The syntax-aware encoder with a task-specific linear head learns the sequence labeling task given the dialogue utterances in a bidirectional encoder setting from the BART encoder. The conversation decoder (i.e., autoregressive decoder from the BART decoder) learns the dialogue summarization task through the linear head.
  • Figure 3: Two-dimensional PCA projection of each speaker style - A (magenta), B (blue), and C (purple). The legend indicates the center point of each cluster.
  • Figure 4: Average tf-idf score on top six ranked POS features by standard deviation (std) according to the speaker styles (A, B, and C).
  • Figure 5: Data distribution of the number of utterances in the SAMSum corpus (training set).
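
The kind of analysis summarized in the Figure 4 caption (tf-idf scores over POS features, ranked by standard deviation across speaker styles) might be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: each speaker's POS tag sequence is treated as one document, the idf term uses a smoothed form `log((1 + N) / (1 + df)) + 1`, the spread is a population standard deviation, and the function names and toy tags are hypothetical.

```python
import math
from collections import Counter
from statistics import pstdev


def pos_tfidf(tags_by_speaker):
    """Compute tf-idf per POS tag, treating each speaker as one document.

    Assumes every speaker has at least one tag.
    """
    n_docs = len(tags_by_speaker)
    counts = {s: Counter(tags) for s, tags in tags_by_speaker.items()}
    vocab = sorted({t for c in counts.values() for t in c})
    # Document frequency: number of speakers who use the tag at all.
    df = {t: sum(1 for c in counts.values() if t in c) for t in vocab}
    scores = {}
    for speaker, c in counts.items():
        total = sum(c.values())
        scores[speaker] = {
            t: (c[t] / total) * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
            for t in vocab
        }
    return scores


def rank_by_std(scores, top_k=6):
    """Return the top_k tags whose tf-idf varies most across speakers."""
    vocab = list(next(iter(scores.values())).keys())
    spread = {t: pstdev([row[t] for row in scores.values()]) for t in vocab}
    return sorted(spread, key=spread.get, reverse=True)[:top_k]


# Toy, made-up tag sequences for three hypothetical speaker styles.
toy_tags = {
    "A": ["NOUN", "VERB", "INTJ", "INTJ"],
    "B": ["NOUN", "VERB", "NOUN", "VERB"],
    "C": ["NOUN", "VERB", "NOUN", "VERB"],
}
toy_scores = pos_tfidf(toy_tags)
most_distinctive = rank_by_std(toy_scores, top_k=1)
```

In the toy data only speaker A uses interjections, so that tag has the largest spread across speakers and is ranked first.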