Talk, Listen, Connect: How Humans and AI Evaluate Empathy in Responses to Emotionally Charged Narratives
Mahnaz Roshanaei, Rezvaneh Rezapour, Magy Seif El-Nasr
TL;DR
This study investigates how humans and AI perceive and evoke empathy when reacting to emotionally charged personal narratives. By collecting human empathy ratings via MTurk and generating parallel AI responses with GPT-4o, with prompts varied through persona attributes and targeted fine-tuning, the authors quantify human-AI alignment using multivariate statistics and distributional comparisons. Key findings show that GPT-4o tends to overestimate empathy, particularly cognitive empathy, but that alignment with human judgments improves substantially when fine-tuning includes story content and reader-level attributes, especially perceived experience similarity. The work highlights important design and ethical considerations for empathetic AI, including the need to balance persuasive empathic responses with authenticity, avoid over-reliance, and ensure equitable empathy across diverse users and contexts.
Abstract
Social interactions promote well-being, yet barriers like geographic distance, time limitations, and mental health conditions can limit face-to-face interactions. Emotionally responsive AI systems, such as chatbots, offer new opportunities for social and emotional support, but raise critical questions about how empathy is perceived and experienced in human-AI interactions. This study examines how empathy is evaluated in AI-generated versus human responses. Using personal narratives, we explored how persona attributes (e.g., gender, empathic traits, shared experiences) and story qualities affect empathy ratings. We compared responses from standard and fine-tuned AI models with human judgments. Results show that while humans are highly sensitive to emotional vividness and shared experience, AI responses are less influenced by these cues and often lack nuance in empathic expression. These findings highlight challenges in designing emotionally intelligent systems that respond meaningfully across diverse users and contexts, and inform the design of ethically aware tools to support social connection and well-being.
