SS-MPC: A Sequence-Structured Multi-Party Conversation System

Yoonjin Jang, Keunha Kim, Youngjoong Ko

TL;DR

SS-MPC generates responses for multi-party conversations (MPC) without an explicit graph encoder: it encodes dialogue structure as a sequence of soft prompts embedded into a Transformer encoder-decoder. It introduces MPC structure tokens (Index, Speaker, and Structure masking) and a post-training stage that teaches the encoder to predict masked structure information, enabling end-to-end generation even when some structural data is missing. On the Ubuntu IRC benchmark, SS-MPC reaches 15.60% BLEU-1 and 12.44% ROUGE-L, improving on the state-of-the-art MPC model by 3.91%p and 0.62%p respectively, with favorable human judgments for fluency, relevance, and informativeness. The work shows a practical benefit for real-world MPC systems, since structure analysis and response generation happen together in one end-to-end framework, and it also notes limitations in dataset coverage and generalization to diverse domains.

Abstract

Recent Multi-Party Conversation (MPC) models typically rely on graph-based approaches to capture dialogue structures. However, these methods have limitations, such as information loss during the projection of utterances into structural embeddings and constraints in leveraging pre-trained language models directly. In this paper, we propose SS-MPC, a response generation model for MPC that eliminates the need for explicit graph structures. Unlike existing models that depend on graphs to analyze conversation structures, SS-MPC internally encodes the dialogue structure as a sequential input, enabling direct utilization of pre-trained language models. Experimental results show that SS-MPC achieves 15.60% BLEU-1 and 12.44% ROUGE-L score, outperforming the current state-of-the-art MPC response generation model by 3.91%p in BLEU-1 and 0.62%p in ROUGE-L. Additionally, human evaluation confirms that SS-MPC generates more fluent and accurate responses compared to existing MPC models.
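
The "%p" figures in the abstract are read here as absolute percentage-point differences, so the baseline scores they imply can be recovered by subtraction. The short Python sketch below works through that arithmetic. The function names are illustrative, and the baseline and relative-gain values are derived from the numbers in the abstract, not quoted from the paper.

```python
def implied_baseline(score: float, gain_pp: float) -> float:
    """Baseline score implied by a reported score and its percentage-point gain."""
    return round(score - gain_pp, 2)


def relative_gain(score: float, gain_pp: float) -> float:
    """Gain expressed relative to the implied baseline, in percent."""
    return 100.0 * gain_pp / (score - gain_pp)


# Scores and gains as reported in the abstract.
bleu1_baseline = implied_baseline(15.60, 3.91)   # 11.69
rougel_baseline = implied_baseline(12.44, 0.62)  # 11.82

bleu1_relative = relative_gain(15.60, 3.91)      # roughly 33% over the baseline
rougel_relative = relative_gain(12.44, 0.62)     # roughly 5% over the baseline
```

Under this reading, the BLEU-1 gain is large in relative terms, while the ROUGE-L gain is modest.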

Paper Structure

This paper contains 30 sections, 10 equations, 3 figures, 8 tables.

Figures (3)

  • Figure 1: An example of data in the MPC dataset (Ubuntu IRC Benchmark Dataset). Each example consists of context and structural information. The context consists of utterances, and the structural information consists of the speaker, target-utterance relation, and addressee relation of each utterance.
  • Figure 2: An overview of SS-MPC. The encoder is expected to analyze the dialogue and predict its structural information. The decoder is expected to generate the final response using the information analyzed by the encoder.
  • Figure 3: An example of the sequence-structure template for an utterance. Two index structure tokens, which represent the utterance's index and the target utterance's index, and two speaker structure tokens, which represent the speaker and addressee of the utterance, are added as a prefix to the tokenized utterance.
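
The template in Figure 3 and the masked-structure prediction mentioned in the TL;DR can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the token spellings (`[IDX_n]`, `[SPK_n]`, `[MASK]`), the prefix order (index tokens first, then speaker tokens), the use of `[MASK]` for missing structure, the whitespace split standing in for a real tokenizer, and the names `Utterance`, `build_template`, and `mask_structure`.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

MASK = "[MASK]"   # hypothetical placeholder for unknown or hidden structure
PREFIX_LEN = 4    # two index tokens + two speaker tokens, as in Figure 3


@dataclass
class Utterance:
    index: int                           # position of the utterance in the dialogue
    speaker: str                         # who wrote the utterance
    text: str                            # raw utterance text
    target_index: Optional[int] = None   # utterance being replied to, if known
    addressee: Optional[str] = None      # speaker being addressed, if known


def index_token(i: Optional[int]) -> str:
    return MASK if i is None else f"[IDX_{i}]"


def speaker_token(name: Optional[str], speaker_ids: Dict[str, int]) -> str:
    return MASK if name is None else f"[SPK_{speaker_ids[name]}]"


def build_template(utt: Utterance, speaker_ids: Dict[str, int]) -> List[str]:
    """Prefix the utterance tokens with four structure tokens."""
    prefix = [
        index_token(utt.index),
        index_token(utt.target_index),
        speaker_token(utt.speaker, speaker_ids),
        speaker_token(utt.addressee, speaker_ids),
    ]
    return prefix + utt.text.split()  # whitespace split stands in for a tokenizer


def mask_structure(
    tokens: List[str], mask_prob: float, rng: random.Random
) -> Tuple[List[str], List[Optional[str]]]:
    """Hide known structure tokens at random; labels keep what was hidden."""
    masked: List[str] = []
    labels: List[Optional[str]] = []
    for position, tok in enumerate(tokens):
        is_known_structure = position < PREFIX_LEN and tok != MASK
        if is_known_structure and rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)    # prediction target for this position
        else:
            masked.append(tok)
            labels.append(None)   # nothing to predict here
    return masked, labels


if __name__ == "__main__":
    ids = {"alice": 0, "bob": 1}
    utt = Utterance(index=2, speaker="bob", text="try sudo apt update",
                    target_index=1, addressee="alice")
    tokens = build_template(utt, ids)
    print(tokens)
    # ['[IDX_2]', '[IDX_1]', '[SPK_1]', '[SPK_0]', 'try', 'sudo', 'apt', 'update']
    print(mask_structure(tokens, mask_prob=0.5, rng=random.Random(0)))
```

Because the structure is carried by ordinary tokens in the input sequence, no separate graph encoder is needed in this sketch, and an utterance with an unknown target or addressee simply carries `[MASK]` in that slot.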