PSentScore: Evaluating Sentiment Polarity in Dialogue Summarization
Yongxin Zhou, Fabien Ringeval, François Portet
TL;DR
This paper addresses the tendency of dialogue summaries to omit affective content, despite its value for user experience and healthcare applications. It introduces PSent, a word-level sentiment proportion, and PSentScore, a reference-less set of measures that uses correlation and error statistics to assess how well the sentiment expressed in a dialogue is preserved in its summary. By training word-level sentiment analyzers and applying a sentiment-driven data filtering strategy, the authors show that preservation of affective content in generated summaries can be improved, at the cost of a modest drop in traditional content-related metrics. The results suggest that selecting sentiment-aligned training data is a practical route to sentiment-aware dialogue summarization.
Abstract
Automatic dialogue summarization is a well-established task whose goal is to distill the most crucial information from human conversations into concise textual summaries. However, most existing research has focused on summarizing factual information and has neglected affective content, which can hold valuable insights for analyzing, monitoring, or facilitating human interactions. In this paper, we introduce and assess PSentScore, a set of measures aimed at quantifying the preservation of affective content in dialogue summaries. Our findings indicate that state-of-the-art summarization models do not preserve affective content well in their summaries. Moreover, we demonstrate that carefully selecting the dialogue samples used for training can improve the preservation of affective content in the generated summaries, albeit with a minor reduction in content-related metrics.
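
To make the idea of a word-level sentiment proportion concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that PSent is the fraction of words in a text tagged with a given polarity, and that a PSentScore-style comparison reports a Pearson correlation and a mean absolute error between dialogue-side and summary-side proportions over a corpus. The toy lexicon `TOY_LEXICON` and the function names `psent`, `pearson`, `mean_abs_error`, and `psent_score` are hypothetical; the paper uses trained word-level sentiment analyzers rather than a lookup table.

```python
import math

# Hypothetical toy lexicon standing in for a trained word-level sentiment analyzer.
TOY_LEXICON = {
    "great": "pos", "happy": "pos", "love": "pos",
    "bad": "neg", "sad": "neg", "hate": "neg",
}


def psent(text, polarity, lexicon=TOY_LEXICON):
    """Assumed definition: share of whitespace-separated words tagged `polarity`."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if lexicon.get(w.strip(".,!?")) == polarity)
    return hits / len(words)


def pearson(xs, ys):
    """Pearson correlation; returns NaN when either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def mean_abs_error(xs, ys):
    """Mean absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def psent_score(dialogues, summaries, polarity):
    """Compare per-sample sentiment proportions of dialogues and their summaries."""
    d = [psent(t, polarity) for t in dialogues]
    s = [psent(t, polarity) for t in summaries]
    return {"pearson": pearson(d, s), "mae": mean_abs_error(d, s)}
```

Under these assumptions, a high correlation together with a low error would indicate that summaries carry roughly the same share of, say, positive words as their source dialogues. No reference summary is needed, since the comparison is between each dialogue and its generated summary.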
