DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset
Yanran Li, Hui Su, Xiaoyu Shen, Wenjie Li, Ziqiang Cao, Shuzi Niu
TL;DR
DailyDialog introduces a high-quality, manually labeled multi-turn dialogue corpus focused on daily-life topics. It provides explicit annotations of dialogue acts (Inform, Questions, Directives, Commissive) and seven emotion categories, enabling analysis of intention and emotion in conversation. Experiments with retrieval-based and generation-based methods show that incorporating intention and emotion cues can improve coherence and agreement with ground-truth responses, while the effect of pretraining on other data depends on how similar that data is to the target domain. With its realistic conversational flows and rich annotations, the dataset is a useful resource for dialogue system research, including domain adaptation and emotion-aware dialogue management.
Abstract
We develop a high-quality multi-turn dialog dataset, DailyDialog, which is intriguing in several aspects. The language is human-written and less noisy. The dialogues in the dataset reflect the way we communicate in daily life and cover a variety of everyday topics. We also manually label the dataset with communication intention and emotion information. We then evaluate existing approaches on the DailyDialog dataset and hope it benefits research on dialog systems.
