Interpersonal Memory Matters: A New Task for Proactive Dialogue Utilizing Conversational History
Bowen Wu, Wenqing Wang, Haoran Li, Ying Li, Jingsong Yu, Baoxun Wang
TL;DR
The paper introduces Memory-aware Proactive Dialogue (MapDia), a task that integrates long-term memory with proactive dialogue to steer conversations toward historically relevant topics at opportune moments. It defines an automated data-construction pipeline to create ChMapData, the first Chinese dataset for memory-aware proactive dialogue, and proposes a Retrieval-Augmented Generation (RAG) framework with three modules: Topic Summarization, Topic Retrieval, and Proactive Topic-shifting Detection and Generation. Through extensive experiments on ChMapData-test and NaturalConv-based test sets, the authors show that a RAG-based system with memory-aware components substantially outperforms end-to-end baselines in both retrieval metrics and human judgments of engagement, quality, and achievement. The work contributes a publicly released dataset and framework, advancing proactive dialogue research by incorporating memory to enhance long-term user engagement and relational continuity, with discussions on limitations and ethical considerations for real-world deployment.
Abstract
Proactive dialogue systems aim to give chatbots the ability to lead conversations towards specific targets, thereby enhancing user engagement and service autonomy. Existing systems typically target pre-defined keywords or entities and neglect the user attributes and preferences implicit in dialogue history, which hinders the development of long-term user intimacy. To address these challenges, we take a radical step towards building a more human-like conversational agent by integrating proactive dialogue systems with long-term memory into a unified framework. Specifically, we define a novel task named Memory-aware Proactive Dialogue (MapDia). By decomposing the task, we then propose an automatic data construction method and create the first Chinese Memory-aware Proactive Dataset (ChMapData). Furthermore, we introduce a joint framework based on Retrieval-Augmented Generation (RAG), featuring three modules: Topic Summarization, Topic Retrieval, and Proactive Topic-shifting Detection and Generation, designed to steer dialogues towards relevant historical topics at the right time. The effectiveness of our dataset and models is validated through both automatic and human evaluations. We release the open-source framework and dataset at https://github.com/FrontierLabs/MapDia.
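To make the three-module flow concrete, the following is a minimal toy sketch, not the released implementation. It assumes each module can be stood in for by a simple heuristic: Topic Summarization concatenates a past session's utterances, Topic Retrieval ranks topics by bag-of-words cosine similarity, and Proactive Topic-shifting Detection is a fixed similarity threshold. The names `Topic`, `summarize_topics`, `retrieve_topic`, `decide_shift`, and the threshold value are all hypothetical; the paper's framework uses trained models for these steps, and the generation step is omitted here.

```python
from __future__ import annotations

import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Topic:
    session_id: int  # index of the past session the topic came from
    summary: str     # text standing in for a model-written topic summary


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def summarize_topics(history: list[list[str]]) -> list[Topic]:
    """Toy Topic Summarization: one topic per past session.

    Assumption: the 'summary' is just the session's utterances joined
    together; a real system would generate a summary with a model.
    """
    return [Topic(i, " ".join(session)) for i, session in enumerate(history)]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_topic(
    topics: list[Topic], context: list[str]
) -> tuple[Optional[Topic], float]:
    """Toy Topic Retrieval: return the best-matching topic and its score."""
    if not topics:
        return None, 0.0
    query = Counter(tokenize(" ".join(context)))
    scored = [(cosine(query, Counter(tokenize(t.summary))), t) for t in topics]
    score, topic = max(scored, key=lambda pair: pair[0])
    return topic, score


def decide_shift(
    history: list[list[str]], context: list[str], threshold: float = 0.2
) -> tuple[bool, Optional[Topic]]:
    """Toy Topic-shifting Detection: shift only if the best score clears
    a hypothetical threshold; otherwise keep the current topic."""
    topic, score = retrieve_topic(summarize_topics(history), context)
    if topic is None or score < threshold:
        return False, None
    return True, topic


if __name__ == "__main__":
    history = [
        ["I adopted a puppy named Momo last week", "She loves chasing balls"],
        ["My flight to Tokyo got delayed"],
    ]
    context = ["I walked past a pet store and saw a cute puppy"]
    print(decide_shift(history, context))
```

In this sketch the pet-store context overlaps with the first past session, so the detector proposes a shift to that topic, while an unrelated context such as a remark about the weather shares no tokens with either session and leaves the conversation on its current topic. Because no stop words are filtered, common words contribute to the score; that is a limitation of the toy retriever, not a property of the paper's method.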
