The Personalization Trap: How User Memory Alters Emotional Reasoning in LLMs

Xi Fang, Weijie Xu, Yuchong Zhang, Stephanie Eckman, Scott Nickleach, Chandan K. Reddy

TL;DR

The paper investigates how long-term user memory in AI assistants reshapes emotional reasoning in large language models (LLMs). The authors evaluate 15 models under memory and memory-free conditions, using explicit advantaged/disadvantaged profiles and intersectional demographics. They use STEU to test emotional understanding and a Modified STEM to test emotion guidance, with human annotation to remove persona-sensitive items. The findings show memory-induced shifts in interpretation that favor advantaged personas, along with demographic biases across gender, age, and religion that also appear in guidance tasks. The work highlights a tension between personalization and fairness in memory-enhanced AI and calls for mitigation strategies to keep such systems from amplifying social inequalities in high-stakes domains.

Abstract

When an AI assistant remembers that Sarah is a single mother working two jobs, does it interpret her stress differently than if she were a wealthy executive? As personalized AI systems increasingly incorporate long-term user memory, understanding how this memory shapes emotional reasoning is critical. We investigate how user memory affects emotional intelligence in large language models (LLMs) by evaluating 15 models on human-validated emotional intelligence tests. We find that identical scenarios paired with different user profiles produce systematically divergent emotional interpretations. Across validated user-independent emotional scenarios and diverse user profiles, systematic biases emerged in several high-performing LLMs, where advantaged profiles received more accurate emotional interpretations. Moreover, LLMs demonstrate significant disparities across demographic factors in emotion understanding and supportive recommendation tasks, indicating that personalization mechanisms can embed social hierarchies into models' emotional reasoning. These results highlight a key challenge for memory-enhanced AI: systems designed for personalization may inadvertently reinforce social inequalities.

Paper Structure

This paper contains 26 sections, 1 equation, 5 figures, and 8 tables.

Figures (5)

  • Figure 1: An illustration of how user profiles affect an AI model's emotional comprehension.
  • Figure 2: Explicit user profile generation and emotional tasks.
  • Figure 3: Model performance varies by user demographics in both emotional understanding (top) and guidance tasks (bottom). Bars show performance differences compared to baseline users (white, non-religious, male, aged 25-34). Positive values mean better performance.
  • Figure 4: Effect of user profiles on models' emotional reasoning, measured by flip rate (the proportion of predictions that changed relative to the No-Memory baseline). Significance levels: ***: p < 0.001, **: p < 0.01, *: p < 0.05.
  • Figure 5: Correlation analysis of raw predicted outputs.
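
The flip rate named in the Figure 4 caption can be illustrated with a short sketch. This is only a reading of the caption's definition (the share of predictions that differ from the No-Memory baseline), not the authors' evaluation code; the function name `flip_rate`, the argument names, and the toy answer labels are all hypothetical.

```python
from typing import Sequence


def flip_rate(baseline: Sequence[str], with_memory: Sequence[str]) -> float:
    """Fraction of items whose prediction differs from the no-memory baseline.

    Assumption: both sequences hold one predicted answer per test item,
    in the same item order.
    """
    if len(baseline) != len(with_memory):
        raise ValueError("prediction lists must cover the same items")
    if not baseline:
        raise ValueError("at least one item is required")
    # Count items where adding the user profile changed the model's answer.
    flips = sum(b != m for b, m in zip(baseline, with_memory))
    return flips / len(baseline)


# Toy example with made-up multiple-choice answers for four items.
no_memory = ["A", "B", "C", "D"]
with_profile = ["A", "C", "C", "B"]
print(flip_rate(no_memory, with_profile))  # 0.5 (two of four answers changed)
```

Note that a flip only records that an answer changed, not whether it became more or less accurate, so under this reading the flip rate would be interpreted alongside the accuracy differences rather than in place of them.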