Rank-O-ToM: Unlocking Emotional Nuance Ranking to Enhance Affective Theory-of-Mind

JiHyun Kim, JuneHyoung Kwon, MiHyeon Kim, Eunju Lee, YoungBin Kim

TL;DR

Rank-O-ToM addresses the challenge of calibrating AI models to interpret nuanced emotional states for affective Theory of Mind (ToM). It introduces synthetic sample blending via an adapted horizontal CutMix and a ranking-based loss that enforces higher confidence on clearer emotions and lower confidence on blended ones, formalized as $\mathcal{L}_{\mathrm{rank}} = \max(0, \max_{c_1} p^{\mathrm{syn}}_{c_1} - \max_{c_1} p^{\mathrm{fer}}_{c_1} + \delta) + \max(0, \max_{c_2} p^{\mathrm{syn}}_{c_2} - \max_{c_2} p^{\mathrm{fr}}_{c_2} + \delta)$, where $p^{\mathrm{syn}}$, $p^{\mathrm{fer}}$, and $p^{\mathrm{fr}}$ denote the predicted class probabilities for synthetic, FER, and FR samples and $\delta$ is a margin. The model also leverages pseudo-labeling of unlabeled FR data with adaptive class-wise thresholds to broaden training diversity. Empirical results on RAF-DB, FERPlus, and AffectNet show improved accuracy and confidence calibration, and qualitative CAM analyses indicate more comprehensive facial region attention and better alignment with compound emotions. Overall, Rank-O-ToM enhances affective ToM by capturing emotional intensity and complexity, enabling more nuanced and trustworthy emotion-aware AI interactions.
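As a rough illustration of the ranking loss stated in the TL;DR, the following Python sketch evaluates the two hinge terms for a single triplet of predictions. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `rank_loss`, the default margin of 0.1, and the choice to take both maxima over all classes (rather than over class subsets $c_1$ and $c_2$) are illustrative assumptions.

```python
def rank_loss(p_syn, p_fer, p_fr, delta=0.1):
    """Hinge-style ranking loss for one (synthetic, FER, FR) triplet.

    Each argument is a list of class probabilities for one sample.
    ASSUMPTIONS (not from the paper): `delta` defaults to 0.1, and the
    confidence of a sample is its largest probability over all classes.
    """
    conf_syn = max(p_syn)  # confidence on the blended (synthetic) sample
    conf_fer = max(p_fer)  # confidence on the FER sample
    conf_fr = max(p_fr)    # confidence on the FR sample

    # Each term is zero once the synthetic confidence sits at least
    # `delta` below the confidence on the corresponding clearer sample.
    term_fer = max(0.0, conf_syn - conf_fer + delta)
    term_fr = max(0.0, conf_syn - conf_fr + delta)
    return term_fer + term_fr


# Ranking respected: the blended sample is the least confident one.
ok = rank_loss([0.5, 0.5], [0.9, 0.1], [0.8, 0.2])

# Ranking violated: the blended sample is the most confident one.
violated = rank_loss([0.9, 0.1], [0.6, 0.4], [0.7, 0.3])
```

In this sketch the loss is zero whenever the ordering holds with the required margin and grows linearly with the size of the violation, which is the behavior the formula describes.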

Abstract

Facial Expression Recognition (FER) plays a foundational role in enabling AI systems to interpret emotional nuances, a critical aspect of affective Theory of Mind (ToM). However, existing models often struggle with poor calibration and a limited capacity to capture emotional intensity and complexity. To address this, we propose Ranking the Emotional Nuance for Theory of Mind (Rank-O-ToM), a framework that leverages ordinal ranking to align confidence levels with the emotional spectrum. By incorporating synthetic samples reflecting diverse affective complexities, Rank-O-ToM enhances the nuanced understanding of emotions, advancing AI's ability to reason about affective states.

Paper Structure

This paper contains 11 sections, 9 equations, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: (a) Affective ToM challenge: interpreting nuanced emotional states, such as blended emotions. (b) Rank-O-ToM blends basic expressions into synthetic samples with ranked confidence scores to capture the emotional spectrum.
  • Figure 2: CAMs and confidence scores for FR, FER, and synthetic samples, showing activations for ambivalent (top) and similar (bottom) expressions with predicted labels.
  • Figure 3: Confidence heatmaps for RAF-DB compound set: Basic expressions (x-axis) and compound expressions (y-axis) with bold squares marking correct Top-2 matches.
  • Figure 4: Class proportions of FER datasets. The pie charts display results for (a) RAF-DB (Basic), (b) RAF-DB (Compound), (c) FERPlus, and (d) AffectNet.
  • Figure 5: Additional heatmaps for RAF-DB compound set. Basic expressions (x-axis) and compound expressions (y-axis) are depicted, with bold squares marking correct Top-2 matches. SCN and EAC models are presented here as supplementary to the main text, which focuses on RAC and Rank-O-ToM.
  • ...and 1 more figure