ESCoT: Towards Interpretable Emotional Support Dialogue Systems

Tenggan Zhang, Xinjie Zhang, Jinming Zhao, Li Zhou, Qin Jin

TL;DR

This work tackles the interpretability gap in emotional support dialogue systems by introducing ESCoT, a scheme that mimics emotion identification, understanding via stimulus and appraisal, and regulation through strategy. It constructs ESD-CoT through a two-stage process: (i) ESD with situation generation and strategy enrichment, and (ii) ESD-CoT with CoT annotations expressed as the quintuple $(EM, ES, IA, SR, RE)$, followed by manual corrections. The authors build 1,708 ESD-CoT dialogues and demonstrate that fine-tuning a backbone model on this data yields responses with improved coherence, informativeness, empathy, and strategy consistency, validated by both automatic metrics and human evaluations. The dataset and code are released to promote future research into interpretable emotional support dialogue systems and CoT-based reasoning in this domain.

Abstract

Understanding the reasoning behind an emotional support response is crucial for establishing connections between users and emotional support dialogue systems. Previous works mostly focus on generating better responses but ignore interpretability, which is extremely important for constructing reliable dialogue systems. To empower the system with better interpretability, we propose an emotional support response generation scheme, named Emotion-Focused and Strategy-Driven Chain-of-Thought (ESCoT), mimicking the process of identifying, understanding, and regulating emotions. Specifically, we construct a new dataset with ESCoT in two steps: (1) Dialogue Generation, where we first generate diverse conversation situations, then enhance dialogue generation using richer emotional support strategies based on these situations; (2) Chain Supplement, where we focus on supplementing selected dialogues with elements such as emotion, stimuli, appraisal, and strategy reason, forming the manually verified chains. Additionally, we develop a model to generate dialogue responses with better interpretability. We also conduct extensive experiments and human evaluations to validate the effectiveness of the proposed ESCoT and the generated dialogue responses. Our data and code are available at https://github.com/TeigenZhang/ESCoT.
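To make the chain structure concrete, here is a minimal Python sketch of how one chain might be represented as the quintuple $(EM, ES, IA, SR, RE)$. The class name, field names, and label format are hypothetical, and the expansions of the abbreviations (emotion, emotion stimulus, individual appraisal, strategy reason, response) are inferred from the abstract and figure captions rather than taken from the released code. The example values are invented for illustration and do not come from the ESD-CoT dataset.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ESCoTChain:
    """One chain (EM, ES, IA, SR, RE); all names here are hypothetical."""

    emotion: str               # EM: the emotion the supporter identifies
    emotion_stimulus: str      # ES: what triggered that emotion
    individual_appraisal: str  # IA: how the seeker appraises the stimulus
    strategy_reason: str       # SR: why a given support strategy is chosen
    response: str              # RE: the supporter's reply to the seeker

    def as_quintuple(self) -> tuple:
        """Return the values in (EM, ES, IA, SR, RE) order."""
        return tuple(getattr(self, f.name) for f in fields(self))

    def render(self) -> str:
        """Serialize as labelled lines, with the reasoning before the response."""
        labels = ("EM", "ES", "IA", "SR", "RE")
        return "\n".join(
            f"{label}: {value}"
            for label, value in zip(labels, self.as_quintuple())
        )


# Invented example values, not taken from the ESD-CoT dataset.
chain = ESCoTChain(
    emotion="anxiety",
    emotion_stimulus="an upcoming job interview",
    individual_appraisal="the seeker believes one mistake will ruin their chances",
    strategy_reason="reassurance may help because the fear seems out of proportion",
    response="It is natural to feel nervous; one slip rarely decides an interview.",
)
print(chain.render())
```

Keeping the response as the last element means a model trained on text in this layout would emit its identification, understanding, and strategy reasoning before the reply, which is one plausible way to expose the interpretability the abstract describes.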

Paper Structure (41 sections, 2 equations, 12 figures, 10 tables)

Figures (12)

  • Figure 1: Illustration of the ESCoT scheme. The supporter first identifies emotion, then understands emotion from perspectives of emotional stimulus and individual appraisal, and finally chooses the appropriate strategy and responds to the seeker to regulate emotion.
  • Figure 2: Illustration of our data generation scheme. We construct the ESD dataset according to the left-side process, and subsequently build the ESD-CoT dataset following the quintuple of $(EM, ES, IA, SR, RE)$ in the right-side process.
  • Figure 3: Prompt used for generating new dialogues.
  • Figure 4: The topic diversity of situations.
  • Figure 5: The word cloud of each component of ESD-CoT chain annotations.
  • ...and 7 more figures