Interpersonal Memory Matters: A New Task for Proactive Dialogue Utilizing Conversational History

Bowen Wu, Wenqing Wang, Haoran Li, Ying Li, Jingsong Yu, Baoxun Wang

TL;DR

The paper introduces Memory-aware Proactive Dialogue (MapDia), a task that integrates long-term memory with proactive dialogue to steer conversations toward historically relevant topics at opportune moments. It defines an automated data-construction pipeline to create ChMapData, the first Chinese dataset for memory-aware proactive dialogue, and proposes a Retrieval-Augmented Generation (RAG) framework with three modules: Topic Summarization, Topic Retrieval, and Proactive Topic-shifting Detection and Generation. Through extensive experiments on ChMapData-test and NaturalConv-based test sets, the authors show that a RAG-based system with memory-aware components substantially outperforms end-to-end baselines in both retrieval metrics and human judgments of engagement, quality, and achievement. The work contributes a publicly released dataset and framework, advancing proactive dialogue research by incorporating memory to enhance long-term user engagement and relational continuity, with discussions on limitations and ethical considerations for real-world deployment.

Abstract

Proactive dialogue systems aim to empower chatbots with the capability of leading conversations towards specific targets, thereby enhancing user engagement and service autonomy. Existing systems typically target pre-defined keywords or entities, neglecting user attributes and preferences implicit in dialogue history, hindering the development of long-term user intimacy. To address these challenges, we take a radical step towards building a more human-like conversational agent by integrating proactive dialogue systems with long-term memory into a unified framework. Specifically, we define a novel task named Memory-aware Proactive Dialogue (MapDia). By decomposing the task, we then propose an automatic data construction method and create the first Chinese Memory-aware Proactive Dataset (ChMapData). Furthermore, we introduce a joint framework based on Retrieval Augmented Generation (RAG), featuring three modules: Topic Summarization, Topic Retrieval, and Proactive Topic-shifting Detection and Generation, designed to steer dialogues towards relevant historical topics at the right time. The effectiveness of our dataset and models is validated through both automatic and human evaluations. We release the open-source framework and dataset at https://github.com/FrontierLabs/MapDia.
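The three-module flow named in the abstract (summarize past dialogues into topics, retrieve the topic most relevant to the current context, then decide whether now is the right moment to shift) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names, the Jaccard-overlap retriever, the "short last turn" timing heuristic, and the thresholds are hypothetical stand-ins, not the authors' models or the API of the released framework.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical stopword list, used only by this toy example.
STOPWORDS = {"the", "a", "an", "i", "you", "to", "and", "is", "it",
             "my", "of", "in", "was", "for", "on", "at"}


def tokens(text: str) -> set[str]:
    """Lowercased word set with stopwords removed."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}


@dataclass
class MemoryTopic:
    summary: str  # stand-in for the output of a topic summarization model


def summarize_topic(dialogue: list[str]) -> MemoryTopic:
    """Placeholder summarizer: keeps the first utterance as the topic summary."""
    return MemoryTopic(summary=dialogue[0])


def retrieve_topic(context: list[str], memory: list[MemoryTopic]):
    """Placeholder retriever: ranks topics by Jaccard overlap with the context."""
    ctx = tokens(" ".join(context))
    best, best_score = None, 0.0
    for topic in memory:
        top = tokens(topic.summary)
        union = ctx | top
        score = len(ctx & top) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = topic, score
    return best, best_score


def should_shift(context: list[str], max_tokens: int = 3) -> bool:
    """Placeholder timing detector: a very short last turn suggests a lull."""
    return bool(context) and len(context[-1].split()) <= max_tokens


def respond(context: list[str], memory: list[MemoryTopic], min_score: float = 0.05) -> str:
    """Shift to a remembered topic only if it is relevant and the timing fits."""
    topic, score = retrieve_topic(context, memory)
    if topic is not None and score >= min_score and should_shift(context):
        return f"By the way, you mentioned earlier: {topic.summary}"
    return "Tell me more."


memory = [
    summarize_topic(["I adopted a puppy named Mochi last week", "So cute!"]),
    summarize_topic(["My flight to Tokyo was delayed", "Oh no"]),
]
print(respond(["I walked past a pet store and saw a puppy", "Yeah."], memory))
```

In this sketch the topic shift is gated by two independent checks, relevance and timing, which mirrors the separation between retrieval and topic-shifting detection described above. A real system would replace each placeholder with a learned model.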

Paper Structure

This paper contains 27 sections, 1 equation, 9 figures, 10 tables.

Figures (9)

  • Figure 1: Comparison of a previous proactive dialogue system (Left), taken from gupta2022target, and our system (Right) on the same sample. The left system transitions the context to a pre-designed target through a bridging path, whereas our system uses summarization, retrieval, and timing detection to generate a memory-aware response.
  • Figure 2: The pipeline of dataset construction. The example shown is not derived from the actual dataset.
  • Figure 3: An overview of our system. Left showcases an example of proactive dialogue with memory awareness. Middle outlines the pipeline, featuring a summarization model for topic extraction, a ranking model to identify relevant historical topics, and a proactive dialogue model for topic shifts and reintroducing past information at the appropriate moments. Right is a breakdown detailing how these models operate.
  • Figure 4: A sample from a previous proactive dialogue system, taken from deng2023survey.
  • Figure 5: The full prompt template used for data construction in Section \ref{data_construction}, with step 2 corresponding to prompt A, step 3 to prompts B and C, and step 4 to prompt D.
  • ...and 4 more figures