EventWeave: A Dynamic Framework for Capturing Core and Supporting Events in Dialogue Systems
Zhengyi Zhao, Shubo Zhang, Yiming Du, Bin Liang, Baojun Wang, Zhongyang Li, Binyang Li, Kam-Fai Wong
TL;DR
The paper tackles the challenge that dialogue models often treat turns in isolation, neglecting underlying event structures that guide natural interactions. It introduces EventWeave, a dynamic hierarchical event-graph framework that differentiates core goals from supporting details and uses a multi-head attention mechanism to selectively retrieve relevant events for each turn. The approach defines three edge types—sequential, temporal, and reasoning—to model nuanced event relationships, along with adaptive node preservation and a multi-perspective graph-retrieval strategy to guide response generation. Experiments on Conversation Chronicle, Multi-Session Chat, and LoCoMo show that EventWeave yields more natural and contextually appropriate responses while reducing computation through graph-based representations and pruning. Together, these results suggest a scalable route to more coherent, efficient long-form dialogue systems that leverage structured event reasoning.
Abstract
Large language models have improved dialogue systems, but they often process conversational turns in isolation, overlooking the event structures that guide natural interactions. We introduce EventWeave, a framework that explicitly models relationships between conversational events to generate more contextually appropriate dialogue responses. EventWeave constructs a dynamic event graph that distinguishes between core events (main goals) and supporting events (interconnected details), employing a multi-head attention mechanism to selectively determine which events are most relevant to the current turn. Unlike summarization or standard graph-based approaches, our method captures three distinct relationship types between events, allowing for more nuanced context modeling. Experiments on three dialogue datasets demonstrate that EventWeave produces more natural and contextually appropriate responses while requiring less computational overhead than models that process the entire dialogue history. Ablation studies confirm that the improvements stem from better event relationship modeling rather than increased information density. Our approach balances comprehensive context understanding with concise response generation, maintaining strong performance across various dialogue lengths through targeted optimization techniques.
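
To make the event-graph idea concrete, the following is a minimal sketch of a graph that separates core events from supporting events and labels each edge with one of the three relationship types named in the TL;DR (sequential, temporal, reasoning). It is an illustration only, not the authors' implementation: the class names EventNode, EventGraph, and EdgeType, the helper build_example_graph, and the example events are all hypothetical, and the multi-head attention retrieval step is not modeled here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple


class EdgeType(Enum):
    """The three relationship types named in the TL;DR."""
    SEQUENTIAL = "sequential"
    TEMPORAL = "temporal"
    REASONING = "reasoning"


@dataclass
class EventNode:
    event_id: str
    text: str
    is_core: bool  # True for a core event (main goal), False for a supporting detail


@dataclass
class EventGraph:
    # Hypothetical container: nodes keyed by id, edges stored as (source, target, type).
    nodes: Dict[str, EventNode] = field(default_factory=dict)
    edges: List[Tuple[str, str, EdgeType]] = field(default_factory=list)

    def add_event(self, node: EventNode) -> None:
        self.nodes[node.event_id] = node

    def link(self, src: str, dst: str, edge_type: EdgeType) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("add both events before linking them")
        self.edges.append((src, dst, edge_type))

    def neighbors(self, event_id: str,
                  edge_type: Optional[EdgeType] = None) -> List[EventNode]:
        """Return events reachable by one outgoing edge, optionally filtered by type."""
        return [
            self.nodes[dst]
            for src, dst, kind in self.edges
            if src == event_id and (edge_type is None or kind == edge_type)
        ]


def build_example_graph() -> EventGraph:
    # Made-up dialogue events, used only to exercise the structure.
    graph = EventGraph()
    graph.add_event(EventNode("e1", "plan a trip to Kyoto", is_core=True))
    graph.add_event(EventNode("e2", "book a flight", is_core=False))
    graph.add_event(EventNode("e3", "flights are cheaper in March", is_core=False))
    graph.link("e1", "e2", EdgeType.SEQUENTIAL)
    graph.link("e2", "e3", EdgeType.REASONING)
    return graph
```

In this sketch, calling neighbors on a core event with a specific EdgeType returns only the supporting events attached through that kind of relationship, which is the sort of typed lookup a retrieval step could build on.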
