Sibyl: Empowering Empathetic Dialogue Generation in Large Language Models via Sensible and Visionary Commonsense Inference

Lanrui Wang; Jiangnan Li; Chenxu Yang; Zheng Lin; Hongyin Tang; Huan Liu; Yanan Cao; Jingang Wang; Weiping Wang

TL;DR

The paper tackles empathetic response generation in multi-turn dialogues, where a single dialogue history admits many plausible continuations (the one-to-many problem). It introduces Sibyl, a Sensible and Visionary Commonsense Knowledge framework that extracts four future-oriented inferences (Cause, Subsequent Event, Emotion State, Intention) to guide responses: GPT-4o produces these inferences during knowledge acquisition, and open-source LLMs are then fine-tuned to predict them from dialogue history alone. The approach is model-agnostic and is evaluated on EmpatheticDialogues and ESConv with automatic, human, and LLM-based assessments, showing consistent improvements over strong baselines and even outperforming prompt-only large-model baselines on several metrics. By letting lighter models anticipate where a dialogue is heading through explicit, symbolic-style inferences, Sibyl offers practical gains for empathetic dialogue systems on resource-constrained platforms.
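The data flow described above (a stronger LLM produces inferences with access to the real response, and a smaller model learns to predict them from history alone) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `TrainingPair`, `format_history`, and `build_training_pairs`, the prompt wording, and the dictionary format of the oracle inferences are all assumptions made for the example.

```python
from dataclasses import dataclass

# The four inference categories named in the summary.
CATEGORIES = ("Cause", "Subsequent Event", "Emotion State", "Intention")


@dataclass
class TrainingPair:
    prompt: str  # what the fine-tuned model sees: dialogue history only
    target: str  # inference written by the stronger LLM, which also saw the response


def format_history(turns):
    """Render a list of (speaker, utterance) tuples as plain text."""
    return "\n".join(f"{speaker}: {utterance}" for speaker, utterance in turns)


def build_training_pairs(turns, oracle_inferences):
    """Build one fine-tuning pair per available inference category.

    `oracle_inferences` is a hypothetical dict mapping a category name to the
    inference text obtained from the stronger LLM. The ground-truth response is
    deliberately absent from the prompt, so the smaller model has to learn to
    anticipate the future of the dialogue from history alone.
    """
    history = format_history(turns)
    pairs = []
    for category in CATEGORIES:
        if category not in oracle_inferences:
            continue  # skip categories the oracle did not produce
        prompt = f"Dialogue history:\n{history}\n\nInfer the {category}:"
        pairs.append(TrainingPair(prompt=prompt, target=oracle_inferences[category]))
    return pairs
```

At inference time, under the same assumptions, the fine-tuned model would receive only the `prompt` side, and its predicted inferences would be supplied to the response generator alongside the dialogue history.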

Abstract

Recently, there has been a heightened interest in building chatbots based on Large Language Models (LLMs) to emulate human-like qualities in multi-turn conversations. Despite having access to commonsense knowledge to better understand the psychological aspects and causality of dialogue context, even these powerful LLMs struggle to achieve the goals of empathy and emotional support. Current commonsense knowledge derived from dialogue contexts is inherently limited and often fails to adequately anticipate the future course of a dialogue. This lack of foresight can mislead LLMs and hinder their ability to provide effective support. In response to this challenge, we present an innovative framework named Sensible and Visionary Commonsense Knowledge (Sibyl). Designed to concentrate on the immediately succeeding dialogue, this paradigm equips LLMs with the capability to uncover the implicit requirements of the conversation, aiming to elicit more empathetic responses. Experimental results demonstrate that incorporating our paradigm for acquiring commonsense knowledge into LLMs comprehensively enhances the quality of their responses.

Paper Structure (23 sections, 6 equations, 7 figures, 10 tables)

Introduction
Related Work
Preliminaries
Problem Formulation
Categories of Commonsense Inference
Method
Visionary Commonsense Acquisition
Sibyl Training
Sibyl Inference and Response Generation
Experiments
Datasets
Implementation Details
Baseline Methods
Automatic Evaluation: RQ1, RQ2
Human Evaluation
...and 8 more sections

Figures (7)

Figure 1: An example from the EmpatheticDialogues dataset reveals that the commonsense inference deduced by COMET and DIALeCT demonstrates notable limitations.
Figure 2: The overview of our proposed paradigm of Commonsense Inference, Sibyl. Incorporating both dialogue history and ground truth responses, the powerful LLM first deduces four categories of visionary commonsense. These inferences serve as a guiding oracle, aiding LLaMA models in inferring from dialogue history alone during the training stage. Subsequently, these trained models function as experts in inferring four categories of commonsense knowledge.
Figure 3: Human A/B test of EmpatheticDialogues ($\%$). The results are statistically significant with p-value < 0.05, and Kappa ($\kappa$) falls between 0.4 and 0.6, suggesting moderate agreement among annotators.
Figure 4: Human A/B test of ESConv ($\%$). The results are statistically significant with p-value < 0.05, and Kappa ($\kappa$) falls between 0.4 and 0.6, suggesting moderate agreement among annotators.
Figure 5: Prompt template for Visionary Commonsense acquisition.
...and 2 more figures
