Latent Distribution Decoupling: A Probabilistic Framework for Uncertainty-Aware Multimodal Emotion Recognition
Jingwang Huang, Jiang Zhong, Qin Lei, Jinpeng Gao, Yuming Yang, Sirui Wang, Peiguang Li, Kaiwen Wei
TL;DR
This work tackles multimodal multi-label emotion recognition by modeling aleatoric uncertainty in a latent emotion space. It introduces LDDU, combining a Transformer-based unimodal extractor, a contrastive disentangled representation in a Gaussian latent space, and an uncertainty-aware fusion with calibration to integrate semantic features with distributional uncertainty. The approach yields state-of-the-art results on CMU-MOSEI and M³ED, with notable improvements on unaligned data and robust performance across metrics, validating the importance of uncertainty modeling in MMER. The framework advances affective computing by providing probabilistic emotion-space modeling, distribution-aware fusion, and calibrated uncertainty as core design principles, enabling more robust and interpretable multimodal emotion recognition.
Abstract
Multimodal multi-label emotion recognition (MMER) aims to identify the concurrent presence of multiple emotions in multimodal data. Existing studies primarily focus on improving fusion strategies and modeling modality-to-label dependencies. However, they often overlook the impact of aleatoric uncertainty: the inherent noise in multimodal data, which hinders modality fusion by introducing ambiguity into feature representations. To address this issue and effectively model aleatoric uncertainty, this paper proposes the Latent emotional Distribution Decomposition with Uncertainty perception (LDDU) framework, which approaches the problem from a novel perspective: probabilistic modeling of a latent emotional space. Specifically, we introduce a contrastive disentangled distribution mechanism within the emotion space to model the multimodal data, allowing for the extraction of both semantic features and uncertainty. Furthermore, we design an uncertainty-aware multimodal fusion method that accounts for the dispersed distribution of uncertainty and integrates distribution information. Experimental results show that LDDU achieves state-of-the-art performance on the CMU-MOSEI and M³ED datasets, highlighting the importance of uncertainty modeling in MMER. Code is available at https://github.com/201983290498/lddu_mmer.git.
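To make the idea of treating uncertainty as part of a latent Gaussian representation more concrete, the following is a minimal sketch only. It is not the LDDU fusion module, whose details the abstract does not give. It assumes each modality is summarized by a diagonal Gaussian (a mean and a variance per latent dimension) and uses standard precision-weighted fusion as a stand-in, so that a noisier modality contributes less. The function name `fuse_gaussians` and all values are hypothetical.

```python
from typing import List, Sequence, Tuple

# (mean, variance) per latent dimension; the variance plays the role of
# aleatoric uncertainty in this illustration (an assumption, not LDDU itself).
Gaussian = Tuple[List[float], List[float]]


def fuse_gaussians(modalities: Sequence[Gaussian]) -> Gaussian:
    """Precision-weighted fusion of diagonal Gaussians (product of Gaussians)."""
    dim = len(modalities[0][0])
    fused_mean: List[float] = []
    fused_var: List[float] = []
    for d in range(dim):
        # Precision is the inverse variance: low uncertainty means high weight.
        precisions = [1.0 / var[d] for _, var in modalities]
        total = sum(precisions)
        weighted = sum(p * mean[d] for p, (mean, _) in zip(precisions, modalities))
        fused_mean.append(weighted / total)
        fused_var.append(1.0 / total)
    return fused_mean, fused_var


# Hypothetical latent Gaussians for one emotion label from two modalities.
text_latent: Gaussian = ([1.0, 0.0], [0.5, 0.5])
audio_latent: Gaussian = ([3.0, 2.0], [0.5, 2.0])  # noisier in dimension 1

mean, var = fuse_gaussians([text_latent, audio_latent])
```

In dimension 0 both modalities are equally certain, so the fused mean is their midpoint. In dimension 1 the audio variance is four times larger, so the fused mean stays close to the text estimate.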
