Modelling the Interplay of Eye-Tracking Temporal Dynamics and Personality for Emotion Detection in Face-to-Face Settings
Meisam J. Seikavandi, Jostein Fimland, Fabricio Batista Narcizo, Maria Barrett, Ted Vucurevich, Jesper Bünsow Boldt, Andrew Burke Dittberner, Paolo Burelli
TL;DR
This work tackles dynamic emotion recognition in face-to-face-like settings by distinguishing between perceived and felt emotions from the listener's perspective. It proposes a personality-aware multimodal architecture that fuses temporal eye-tracking data with Big Five traits and contextual cues from talking-face stimuli. Results from 73 participants show that stimulus cues boost perceived-emotion predictions, while personality traits substantially improve felt-emotion recognition, with macro F1 up to 0.58 for felt valence. These findings support a layered BET–TCE framework and point to more personalized, ecologically valid affective computing systems.
Abstract
Accurate recognition of human emotions is critical for adaptive human-computer interaction, yet it remains challenging in dynamic, conversation-like settings. This work presents a personality-aware multimodal framework that integrates eye-tracking sequences, Big Five personality traits, and contextual stimulus cues to predict both perceived and felt emotions. Seventy-three participants viewed speech-containing clips from the CREMA-D dataset while their eye movements were recorded, and they also provided personality assessments and emotion ratings. Our neural models captured temporal gaze dynamics and fused them with trait and stimulus information, yielding consistent gains over SVM baselines and baselines from the literature. Results show that (i) stimulus cues strongly enhance perceived-emotion predictions (macro F1 up to 0.77), while (ii) personality traits provide the largest improvements for felt-emotion recognition (macro F1 up to 0.58). These findings highlight the benefit of combining physiological, trait-level, and contextual information to address the inherent subjectivity of emotion. By distinguishing between perceived and felt responses, our approach advances multimodal affective computing and points toward more personalized and ecologically valid emotion-aware systems.
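The results above are reported as macro F1, which is the unweighted mean of the per-class F1 scores, so that rare emotion classes count as much as frequent ones. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the three valence labels in the usage example are illustrative assumptions that do not come from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list.

    Hypothetical helper for illustration; label names are arbitrary.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is positive for
        # every label that appears in y_true or y_pred.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy usage with made-up valence labels (assumed, not from the study).
y_true = ["neg", "neg", "neu", "pos", "pos", "pos"]
y_pred = ["neg", "neu", "neu", "pos", "pos", "neg"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

In this toy case the per-class F1 scores are 0.5, 2/3, and 0.8, so the macro average is about 0.656. Because each class contributes equally, a model that ignores a minority class is penalized more heavily than it would be under plain accuracy.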
