EmoLLM: Appraisal-Grounded Cognitive-Emotional Co-Reasoning in Large Language Models

Yifei Zhang, Mingyang Li, Henry Gao, Liang Zhao

Abstract

Large language models (LLMs) demonstrate strong cognitive intelligence (IQ), yet many real-world interactions also require emotional intelligence (EQ) to produce responses that are both factually reliable and emotionally appropriate. In settings such as emotional support, technical assistance, and consultation, effective dialogue depends on how situations are appraised with respect to the user's needs, goals, and coping capacity. Inspired by appraisal theory, we propose EmoLLM, an appraisal-grounded framework for IQ/EQ co-reasoning in dialogue. EmoLLM uses an explicit Appraisal Reasoning Graph (ARG) to structure intermediate reasoning over contextual facts, inferred user needs, appraisal dimensions, emotional states, and response strategies before generating a reply. We train EmoLLM in a multi-turn role-play environment with reinforcement learning, where reverse-perspective reasoning provides reward signals based on predicted user-side consequences of responses. Across diverse dialogue settings, EmoLLM improves emotional state outcomes and response quality over strong baselines while preserving strong factual reliability.

Paper Structure

This paper contains 78 sections, 1 theorem, 22 equations, 6 figures, 8 tables.

Key Result

Theorem 1

Assume a discounted MDP with bounded rewards $|r(s,a)|\le R_{\max}$ and discount factor $\gamma\in(0,1)$. Let $Q_{\pi}(s,a)$ denote the true action value under policy $\pi$, and let $Q_{\pi}^{(n)}(s,a)$ denote its $n$-step truncated return. Then, for any policy $\pi$ and any state--action pair $(s,a

Figures (6)

  • Figure 1: Why IQ--EQ co-reasoning matters. IQ-only responses can be factually relevant but emotionally insensitive, while EQ-only responses can be emotionally supportive but insufficiently grounded in the underlying situation. IQ--EQ co-reasoning enables responses that are factually grounded, emotionally attuned, and strategically appropriate.
  • Figure 2: Appraisal Reasoning Graph (ARG) in EmoLLM. At each dialogue turn, EmoLLM instantiates an ARG from the dialogue context to perform appraisal-grounded cognitive--emotional co-reasoning before generating a reply. The process repeats across turns in multi-turn interaction.
  • Figure 3: Stage II: Multi-turn RL with reverse-perspective reasoning. The policy interacts with a user simulator to generate dialogue trajectories. For each response, the model performs reverse-perspective reasoning to estimate the induced user-side transition in needs, appraisals, and emotions, optionally with $n$-step lookahead. A judge model evaluates the predicted transition to produce reverse-perspective reward signals for policy optimization.
  • Figure 4: Average Emotional Gain per Turn (EG/Turn) across four benchmarks. Higher values indicate greater positive emotional improvement during the dialogue.
  • Figure 5: Effect of reverse-perspective lookahead depth on EmpatheticDialogues. SR is shown on the left axis; ES/EA (rated on a 1--5 scale) and AT (turns; lower is better) are shown on the right axis.
  • ...and 1 more figure
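
The node types named in the abstract and in the Figure 2 caption (contextual facts, inferred user needs, appraisal dimensions, emotional states, response strategies) can be pictured as a small typed record that is filled in before a reply is generated. The following is only a minimal sketch: the class name, field names, stage order, and the example appraisal key are assumptions made for illustration, and the paper's actual ARG schema is not shown in this excerpt.

```python
from dataclasses import dataclass, field

# Hypothetical stage order, following the list given in the abstract.
ARG_STAGES = ("facts", "needs", "appraisals", "emotions", "strategies")


@dataclass
class AppraisalReasoningGraph:
    """Illustrative container for one turn's intermediate reasoning."""
    facts: list[str] = field(default_factory=list)            # contextual facts
    needs: list[str] = field(default_factory=list)            # inferred user needs
    appraisals: dict[str, str] = field(default_factory=dict)  # dimension -> judgment
    emotions: list[str] = field(default_factory=list)         # inferred emotional states
    strategies: list[str] = field(default_factory=list)       # planned response strategies

    def missing_stages(self) -> list[str]:
        """Return the stages that are still empty, in reasoning order."""
        return [stage for stage in ARG_STAGES if not getattr(self, stage)]

    def is_complete(self) -> bool:
        """True once every stage has at least one entry."""
        return not self.missing_stages()


# Usage: a partially filled graph reports which stages remain.
graph = AppraisalReasoningGraph(facts=["deadline missed"], needs=["reassurance"])
print(graph.missing_stages())
```

A completeness check of this kind is one simple way to require that a reply be generated only after every stage has been reasoned over; whether EmoLLM enforces such a constraint is not stated here.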

Theorems & Definitions (2)

  • Theorem 1: Under the latent-state MDP abstraction, lookahead depth reduces truncation bias
  • proof
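
The statement of Theorem 1 is cut off in this excerpt, so its exact bound is not reproduced here. As a hedged illustration of the general idea, the sketch below assumes the $n$-step truncated return is the discounted sum of the first $n$ rewards with no bootstrap term, and checks the standard bound $\gamma^{n} R_{\max}/(1-\gamma)$ on the gap to the full return. The function names, the single random reward trajectory, and the long finite horizon standing in for the infinite sum are all assumptions for illustration, not the paper's construction.

```python
import random


def discounted_return(rewards, gamma, n=None):
    """Sum of gamma**t * r_t over the first n rewards (all rewards if n is None)."""
    horizon = len(rewards) if n is None else min(n, len(rewards))
    return sum(gamma**t * rewards[t] for t in range(horizon))


def truncation_bound(gamma, r_max, n):
    """Standard geometric-tail bound on the n-step truncation gap."""
    return gamma**n * r_max / (1.0 - gamma)


def check(gamma=0.9, r_max=1.0, horizon=2000, seed=0):
    """Compare the observed truncation gap with the bound for several depths."""
    rng = random.Random(seed)
    # One bounded reward trajectory; the long horizon approximates the infinite sum.
    rewards = [rng.uniform(-r_max, r_max) for _ in range(horizon)]
    full = discounted_return(rewards, gamma)
    rows = []
    for n in (1, 2, 4, 8, 16):
        gap = abs(full - discounted_return(rewards, gamma, n))
        rows.append((n, gap, truncation_bound(gamma, r_max, n)))
    return rows


for n, gap, bound in check():
    print(f"n={n:2d}  gap={gap:.4f}  bound={bound:.4f}")
```

Because the bound holds for every bounded reward sequence, it also holds in expectation under any policy. It shrinks geometrically in $n$, which is consistent with the listed claim that lookahead depth reduces truncation bias.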