How Did We Get Here? Summarizing Conversation Dynamics
Yilun Hua, Nicholas Chernogor, Yuzhe Gu, Seoyeon Julie Jeong, Miranda Luo, Cristian Danescu-Niculescu-Mizil
TL;DR
The paper introduces the task of writing summaries of conversation dynamics (SCDs), which capture a conversation's trajectory and tonal evolution beyond its topical content, and builds a dataset of human-written SCDs on CGA-CMV, released via ConvoKit. It evaluates a range of machine-generated SCD baselines, including prompting and finetuning, and uses a downstream derailment-forecasting task to show that SCDs improve predictive performance and interpretability for both humans and automated systems. Human-written SCDs better convey tone changes and trajectory, while machine-generated SCDs offer strong topical context; among the automated approaches, procedural prompts give the strongest signal, supporting a summarize-then-forecast paradigm. The work highlights practical implications for moderation and counseling contexts and provides open resources to spur further work on dynamics-focused summarization.
Abstract
Throughout a conversation, the way participants interact with each other is in constant flux: their tones may change, they may resort to different strategies to convey their points, or they might alter their interaction patterns. An understanding of these dynamics can complement that of the actual facts and opinions discussed, offering a more holistic view of the trajectory of the conversation: how it arrived at its current state and where it is likely heading. In this work, we introduce the task of summarizing the dynamics of conversations by constructing a dataset of human-written summaries and exploring several automated baselines. We evaluate whether such summaries can capture the trajectory of conversations via an established downstream task: forecasting whether an ongoing conversation will eventually derail into toxic behavior. We show that these summaries help both humans and automated systems with this forecasting task. Humans make predictions three times faster, and with greater confidence, when reading the summaries than when reading the transcripts. Furthermore, automated forecasting systems are more accurate when they first construct summaries of conversation dynamics and then predict from them than when they predict directly from the transcripts.
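The contrast drawn in the last sentence, between predicting directly from a transcript and first summarizing and then predicting, can be sketched as two small pipelines. This is only an illustrative sketch: the function names, the `Transcript` type, and the toy summarizer and forecaster are all hypothetical stand-ins, not the paper's models, prompts, or the ConvoKit API.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: one string per utterance, in order.
Transcript = List[str]


def forecast_direct(transcript: Transcript,
                    forecaster: Callable[[str], float]) -> float:
    """Baseline: score the raw transcript text for derailment risk."""
    return forecaster("\n".join(transcript))


def summarize_then_forecast(transcript: Transcript,
                            summarizer: Callable[[Transcript], str],
                            forecaster: Callable[[str], float]) -> Tuple[str, float]:
    """Summarize the conversation's dynamics first, then score the summary."""
    summary = summarizer(transcript)
    return summary, forecaster(summary)


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_summarizer(transcript: Transcript) -> str:
    n = len(transcript)
    # Count utterances with an exclamation mark in the later part.
    tense = sum("!" in utterance for utterance in transcript[n // 2:])
    tone = "grows heated" if tense else "stays calm"
    return f"{n} utterances; the tone {tone} in the second half."


def toy_forecaster(text: str) -> float:
    # Keyword rule standing in for a trained classifier or prompted model.
    return 0.8 if "heated" in text else 0.2


example = [
    "I disagree with your point.",
    "Fair, can you say why?",
    "Because you never listen!",
]
summary, risk = summarize_then_forecast(example, toy_summarizer, toy_forecaster)
direct_risk = forecast_direct(example, toy_forecaster)
```

In this sketch the summary is returned alongside the score, so a human reader can inspect the intermediate text that the prediction was based on.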
