MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition
Zheng Lian, Licai Sun, Yong Ren, Hao Gu, Haiyang Sun, Lan Chen, Bin Liu, Jianhua Tao
TL;DR
MERBench tackles the lack of fair comparison in multimodal emotion recognition by introducing a unified benchmark and the Chinese MER2023 dataset. It systematically evaluates unimodal and multimodal features, fusion strategies, cross-corpus transfer, and robustness to punctuation and noise, under a standardized pipeline. Key contributions include a rigorous baseline suite, extensive cross-dataset analyses, and recommendations favoring pre-training and language-aware encoders, plus a public codebase. The work provides a practical framework for reproducible research and highlights directions for robust, scalable emotion recognition in real-world, multilingual settings.
Abstract
Multimodal emotion recognition plays a crucial role in enhancing user experience in human-computer interaction. Over the past few decades, researchers have proposed a series of algorithms and achieved impressive progress. Although each method reports strong performance, the methods cannot be compared fairly because of inconsistencies in feature extractors, evaluation protocols, and experimental settings. These inconsistencies severely hinder the development of the field. Therefore, we build MERBench, a unified evaluation benchmark for multimodal emotion recognition. We aim to reveal the contribution of important techniques employed in previous work, including feature selection, multimodal fusion, robustness analysis, fine-tuning, and pre-training. We hope this benchmark can provide clear and comprehensive guidance for follow-up researchers. Based on the evaluation results of MERBench, we further point out some promising research directions. Additionally, we introduce a new emotion dataset, MER2023, which focuses on the Chinese language environment. This dataset can serve as a benchmark for research on multi-label learning, noise robustness, and semi-supervised learning. We encourage follow-up researchers to evaluate their algorithms under the same experimental setup as MERBench for fair comparisons. Our code is available at: https://github.com/zeroQiaoba/MERTools.
