Enhancing Task-Oriented Dialogues with Chitchat: a Comparative Study Based on Lexical Diversity and Divergence
Armand Stricker, Patrick Paroubek
TL;DR
This paper tackles the problem of repetitive responses in task-oriented dialogues (TODs) by evaluating three chitchat augmentation strategies (Accentor, KETOD, and FusedChat) against a BST chitchat reference. It employs entropy-based measures (Shannon entropy and conditional entropy) and Jensen-Shannon divergence, at both corpus and token levels, to quantify lexical diversity and lexical divergence, and it includes an analysis of the top 20 divergent tokens. Results show that FusedChat yields the largest diversity gains, that Accentor often provides limited diversity improvements despite higher engagement, and that KETOD contributes moderate gains with notable grounding in external knowledge. The study suggests that integrating task and chitchat through more situated grounding (emotions, personas, external knowledge) can produce more natural and varied TODs, guiding future dataset construction and model architectures toward richer, more human-like dialogue.
Abstract
Task-oriented dialogues (TODs) have recently been enriched with chitchat in an effort to make them more diverse and engaging. This enhancement is particularly valuable because TODs are often confined to narrow domains, which makes mitigating repetitive and predictable responses a significant challenge. This paper presents a comparative analysis of three chitchat enhancements, aiming to identify the most effective approach in terms of diversity. Additionally, we quantify the divergence between the added chitchat, the original task-oriented language, and chitchat typically found in chitchat datasets, highlighting the top 20 divergent keywords for each comparison. Our findings motivate a discussion of future enhancements for augmenting TODs, emphasizing the importance of grounding dialogues beyond the task to achieve more diverse and natural exchanges.
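To make the measures named above concrete, the following is a minimal sketch of how unigram Shannon entropy, a bigram conditional entropy, and Jensen-Shannon divergence with per-token contributions might be computed over tokenized text. It is not the authors' implementation: the function names, the whitespace tokenization, the choice of bigrams for the conditional entropy, and the toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def distribution(tokens):
    """Relative unigram frequencies of a non-empty token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def shannon_entropy(p):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(prob * math.log2(prob) for prob in p.values() if prob > 0)


def conditional_entropy(tokens):
    """H(next token | previous token) in bits, estimated from bigram counts.

    Assumption: a bigram formulation; the paper may condition differently.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])
    total = sum(bigrams.values())
    h = 0.0
    for (prev, _next), c in bigrams.items():
        # p(prev, next) * log2 p(next | prev)
        h -= (c / total) * math.log2(c / contexts[prev])
    return h


def js_contributions(p, q):
    """Per-token contributions to the Jensen-Shannon divergence (base 2)."""
    contrib = {}
    for tok in set(p) | set(q):
        pi, qi = p.get(tok, 0.0), q.get(tok, 0.0)
        mi = 0.5 * (pi + qi)  # mixture distribution M = (P + Q) / 2
        c = 0.0
        if pi > 0:
            c += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            c += 0.5 * qi * math.log2(qi / mi)
        contrib[tok] = c
    return contrib


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; ranges from 0 to 1."""
    return sum(js_contributions(p, q).values())


def top_divergent(p, q, k=20):
    """The k tokens contributing most to the divergence between p and q."""
    contrib = js_contributions(p, q)
    return sorted(contrib.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    # Toy, made-up utterances standing in for task and chitchat text.
    task = "i would like to book a table for two at seven".split()
    chat = "i love italian food and i hope you enjoy your dinner".split()
    p, q = distribution(task), distribution(chat)
    print("entropy(task):", shannon_entropy(p))
    print("cond. entropy(task):", conditional_entropy(task))
    print("JSD(task, chat):", js_divergence(p, q))
    print("top divergent:", top_divergent(p, q, k=5))
```

With base-2 logarithms the divergence is bounded between 0 (identical distributions) and 1 (disjoint vocabularies), which makes values comparable across corpus pairs. Ranking tokens by their individual contribution is one plausible way to obtain a "top 20 divergent tokens" list; on real corpora, smoothing or a minimum-frequency cutoff would likely be needed to keep rare tokens from dominating.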
