MIDAS: Mixing Ambiguous Data with Soft Labels for Dynamic Facial Expression Recognition

Ryosuke Kawamura; Hideaki Hayashi; Noriko Takemura; Hajime Nagahara

TL;DR

MIDAS addresses ambiguity in dynamic facial expression recognition (DFER) by augmenting the training data with mixed video frames and mixed soft labels. It convexly combines frames from two distinct clips together with their soft emotion distributions, which extends mixup to video data and to settings where the true hard label is unknown. The method is formalized in a vicinal risk framework with a beta-distributed mixing ratio. On the DFEW dataset, MIDAS surpasses state-of-the-art methods in both weighted average recall (WAR) and unweighted average recall (UAR), improves performance on underrepresented emotions, and generalizes across datasets to AFEW. The findings suggest that mixing-based augmentation with soft labels is a robust strategy for real-world facial expression recognition, where annotators often disagree and emotions often co-occur over time.

Abstract

Dynamic facial expression recognition (DFER) is an important task in the field of computer vision. To apply automatic DFER in practice, it is necessary to accurately recognize ambiguous facial expressions, which often appear in data in the wild. In this paper, we propose MIDAS, a data augmentation method for DFER, which augments ambiguous facial expression data with soft labels consisting of probabilities for multiple emotion classes. In MIDAS, the training data are augmented by convexly combining pairs of video frames and their corresponding emotion class labels, which can also be regarded as an extension of mixup to soft-labeled video data. This simple extension is remarkably effective in DFER with ambiguous facial expression data. To evaluate MIDAS, we conducted experiments on the DFEW dataset. The results demonstrate that the model trained on the data augmented by MIDAS outperforms the existing state-of-the-art method trained on the original dataset.
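
The convex combination described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mix_clips`, the value `alpha=0.4`, the array shapes, and the seven-class label layout are all assumptions made for illustration, and NumPy stands in for whatever training framework is actually used. The sketch draws a mixing ratio from a Beta distribution, as in standard mixup, and applies the same ratio to the frames and to the soft labels.

```python
import numpy as np


def mix_clips(clip_a, soft_a, clip_b, soft_b, alpha=0.4, rng=None):
    """Convexly combine two clips and their soft labels (illustrative sketch).

    clip_a, clip_b: arrays of identical shape, e.g. (frames, channels, H, W).
    soft_a, soft_b: probability vectors over the emotion classes.
    alpha: Beta distribution parameter; 0.4 is an assumed value, not one
           taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    if clip_a.shape != clip_b.shape:
        raise ValueError("clips must share the same shape")
    if soft_a.shape != soft_b.shape:
        raise ValueError("soft labels must share the same shape")

    # One mixing ratio per pair, shared by the frames and the labels.
    lam = rng.beta(alpha, alpha)

    # Frame-by-frame convex combination of the two clips.
    mixed_clip = lam * clip_a + (1.0 - lam) * clip_b
    # The same convex combination applied to the soft labels.
    mixed_label = lam * soft_a + (1.0 - lam) * soft_b
    return mixed_clip, mixed_label, lam


# Toy usage with random data; shapes and class count are hypothetical.
rng = np.random.default_rng(0)
clip_a = rng.random((4, 3, 8, 8))
clip_b = rng.random((4, 3, 8, 8))
soft_a = np.array([0.6, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0])
soft_b = np.array([0.0, 0.0, 0.2, 0.8, 0.0, 0.0, 0.0])

mixed_clip, mixed_label, lam = mix_clips(clip_a, soft_a, clip_b, soft_b, rng=rng)
```

Because both inputs are probability vectors and the weights `lam` and `1 - lam` sum to one, the mixed label is again a valid probability vector, so it can be used directly as a soft target in a cross-entropy loss.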
