
In-Depth Analysis of Emotion Recognition through Knowledge-Based Large Language Models

Bin Han, Cleo Yau, Su Lei, Jonathan Gratch

TL;DR

This work addresses context-dependent emotion recognition by fusing facial expressions with situational knowledge via Bayesian Cue Integration (BCI). It first models emotion distributions from facial cues, $P(e|f)$, using FACET, EAC, and an LSTM, then adds context through GPT-4 prompts and a neural fusion approach to obtain $P(e|c,f)$. The study shows that context-aware fusion, especially the LSTM+GPT-4 (BCI) configuration, yields emotion distributions closely aligned with human context-based judgments, with performance approaching human baselines on a prisoner's dilemma task. The findings highlight the potential of knowledge-based LLMs and cue integration to advance affective computing, particularly in socially complex settings where context shapes emotion perception.

Abstract

Emotion recognition in social situations is a complex task that requires integrating information from both facial expressions and the situational context. While traditional approaches to automatic emotion recognition have focused on decontextualized signals, recent research emphasizes the importance of context in shaping emotion perceptions. This paper contributes to the emerging field of context-based emotion recognition by leveraging psychological theories of human emotion perception to inform the design of automated methods. We propose an approach that combines emotion recognition methods with Bayesian Cue Integration (BCI) to integrate emotion inferences from decontextualized facial expressions and contextual knowledge inferred via large language models. We test this approach in the context of interpreting facial expressions during a social task, the prisoner's dilemma. Our results provide clear support for BCI across a range of automatic emotion recognition methods. The best automated method achieved results comparable to human observers, suggesting the potential for this approach to advance the field of affective computing.
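
The cue-integration step described above can be illustrated with a minimal sketch. It assumes the form commonly used in the cue-integration literature, $P(e|c,f) \propto P(e|f)\,P(e|c)/P(e)$, which treats the face and context cues as conditionally independent given the emotion; the exact formulation used in the paper is not given in this summary. The function names, the uniform default prior, and all numbers below are hypothetical and chosen only for illustration. The seven emotion labels follow the Figure 4 caption.

```python
# Illustrative sketch of Bayesian Cue Integration (BCI); not the authors' code.
# Assumed form: P(e|c,f) is proportional to P(e|f) * P(e|c) / P(e).

EMOTIONS = ["joy", "neutral", "surprise", "sadness", "disgust", "fear", "anger"]


def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("distribution has no probability mass")
    return {e: w / total for e, w in weights.items()}


def bayesian_cue_integration(p_e_given_f, p_e_given_c, p_e=None, eps=1e-9):
    """Combine a face-based and a context-based emotion distribution.

    p_e_given_f: P(e|f), e.g. from a facial expression model.
    p_e_given_c: P(e|c), e.g. from an LLM prompted with the situation.
    p_e:         prior P(e); a uniform prior is assumed if omitted.
    """
    if p_e is None:
        p_e = {e: 1.0 / len(EMOTIONS) for e in EMOTIONS}
    unnormalized = {
        e: p_e_given_f[e] * p_e_given_c[e] / max(p_e[e], eps)
        for e in EMOTIONS
    }
    return normalize(unnormalized)


# Made-up example: the face alone looks joyful, but the context
# (say, being betrayed by the other player) suggests anger or sadness.
p_face = {"joy": 0.50, "neutral": 0.20, "surprise": 0.10, "sadness": 0.05,
          "disgust": 0.05, "fear": 0.05, "anger": 0.05}
p_context = {"joy": 0.05, "neutral": 0.10, "surprise": 0.15, "sadness": 0.20,
             "disgust": 0.10, "fear": 0.05, "anger": 0.35}

posterior = bayesian_cue_integration(p_face, p_context)
for emotion in EMOTIONS:
    print(f"{emotion:>8}: {posterior[emotion]:.3f}")
```

In this toy case the context pulls probability away from joy and toward anger, which is the qualitative effect that context-aware fusion is meant to capture. An uninformative (uniform) context distribution leaves the face-based distribution unchanged.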

Paper Structure

This paper contains 12 sections, 6 figures, and 5 tables.

Figures (6)

  • Figure 1: Emotional probability distribution for $P(e|f)$.
  • Figure 2: GPT Prompt for $P(e|c,f)$: Integrating facial cue and the game outcome to predict Player A's emotion.
  • Figure 3: Emotional probability distribution for $P(e|c,f)$.
  • Figure 4: Confusion matrix for facial emotion recognition. Human context-free is used as a ground truth (J stands for Joy, N for Neutral, Su for Surprise, Sa for Sadness, D for Disgust, F for Fear, and A for Anger).
  • Figure 5: Confusion matrix for facial and contextual emotion recognition. Human context-based is used as a ground truth.
  • ...and 1 more figure