Sibyl: Empowering Empathetic Dialogue Generation in Large Language Models via Sensible and Visionary Commonsense Inference
Lanrui Wang, Jiangnan Li, Chenxu Yang, Zheng Lin, Hongyin Tang, Huan Liu, Yanan Cao, Jingang Wang, Weiping Wang
TL;DR
The paper tackles the challenge of generating empathetic responses in multi-turn dialogues, where any given dialogue history admits many plausible futures (a one-to-many problem). It introduces Sibyl, a Sensible and Visionary Commonsense Knowledge framework that derives four future-oriented inferences (Cause, Subsequent Event, Emotion State, Intention) to guide response generation. GPT-4o is used to acquire this knowledge, and open-source LLMs are then fine-tuned to predict the same inferences from the dialogue history alone. The approach is model-agnostic and is evaluated on EDdata and ESConv with automatic, human, and LLM-based assessments, showing consistent improvements over strong baselines and even outperforming prompt-based large-model baselines on several metrics. Because it lets lighter models anticipate where a dialogue is heading through explicit, symbolic-style inferences, Sibyl offers practical gains for empathetic dialogue systems on resource-constrained platforms.
Abstract
Recently, there has been a heightened interest in building chatbots based on Large Language Models (LLMs) to emulate human-like qualities in multi-turn conversations. Despite having access to commonsense knowledge to better understand the psychological aspects and causality of dialogue context, even these powerful LLMs struggle to achieve the goals of empathy and emotional support. Current commonsense knowledge derived from dialogue contexts is inherently limited and often fails to adequately anticipate the future course of a dialogue. This lack of foresight can mislead LLMs and hinder their ability to provide effective support. In response to this challenge, we present an innovative framework named Sensible and Visionary Commonsense Knowledge (Sibyl). Designed to concentrate on the immediately succeeding dialogue, this paradigm equips LLMs with the capability to uncover the implicit requirements of the conversation, aiming to elicit more empathetic responses. Experimental results demonstrate that incorporating our paradigm for acquiring commonsense knowledge into LLMs comprehensively enhances the quality of their responses.
