AMMSM: Adaptive Motion Magnification and Sparse Mamba for Micro-Expression Recognition
Xuxiong Liu, Tengteng Dong, Fei Wang, Weijie Feng, Xiao Sun
TL;DR
AMMSM addresses the challenge of recognizing subtle micro-expressions by integrating adaptive, self-supervised motion magnification with a Sparse Mamba backbone that selects motion-critical regions. The framework jointly optimizes the magnification factor and sparsity ratios through evolutionary search and end-to-end training, achieving state-of-the-art performance and strong robustness on CASME II and SAMM. Key contributions include the Adaptive Motion Magnification module, the Sparse State Space Duality block, and the adaptive configuration search, all validated through extensive ablations and leave-one-subject-out (LOSO) benchmarks. This approach offers a scalable, efficient path toward high-precision micro-expression recognition (MER) in real-world, resource-constrained settings.
Abstract
Micro-expressions are typically regarded as unconscious manifestations of a person's genuine emotions. However, their short duration and subtle signals pose significant challenges for downstream recognition. To address this, we propose Adaptive Motion Magnification and Sparse Mamba (AMMSM), a multi-task learning framework. The framework uses self-supervised subtle motion magnification to capture micro-expressions more accurately, while a sparse spatial selection Mamba architecture combines sparse activation with the Visual Mamba model to represent key motion regions more effectively. Additionally, we employ evolutionary search to optimize the magnification factor and the sparsity ratios of spatial selection, followed by fine-tuning to further improve performance. Extensive experiments on two standard datasets demonstrate that the proposed AMMSM achieves state-of-the-art (SOTA) accuracy and robustness.
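The evolutionary search mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the search ranges, population sizes, the number of stages, and the helper names (`toy_fitness`, `random_config`, `mutate`, `crossover`, `evolutionary_search`) are all assumptions made for illustration. In particular, `toy_fitness` is a hypothetical stand-in for the validation accuracy that a real search would obtain by evaluating the trained model under each candidate configuration.

```python
import random

# Assumed search ranges (hypothetical, not taken from the paper).
ALPHA_RANGE = (1.0, 8.0)   # magnification factor
RATIO_RANGE = (0.3, 1.0)   # fraction of spatial tokens kept per stage


def toy_fitness(config):
    """Hypothetical proxy for validation accuracy (higher is better)."""
    alpha, ratios = config
    return -((alpha - 4.0) ** 2) - sum((r - 0.7) ** 2 for r in ratios)


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def random_config(rng, n_stages):
    """Sample one (magnification factor, per-stage sparsity ratios) candidate."""
    alpha = rng.uniform(*ALPHA_RANGE)
    ratios = tuple(rng.uniform(*RATIO_RANGE) for _ in range(n_stages))
    return (alpha, ratios)


def mutate(config, rng, sigma=0.3):
    """Perturb every gene with Gaussian noise, then clamp to the valid range."""
    alpha, ratios = config
    new_alpha = clamp(alpha + rng.gauss(0.0, sigma), *ALPHA_RANGE)
    new_ratios = tuple(
        clamp(r + rng.gauss(0.0, 0.2 * sigma), *RATIO_RANGE) for r in ratios
    )
    return (new_alpha, new_ratios)


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    alpha = a[0] if rng.random() < 0.5 else b[0]
    ratios = tuple(ra if rng.random() < 0.5 else rb for ra, rb in zip(a[1], b[1]))
    return (alpha, ratios)


def evolutionary_search(fitness, n_stages=3, pop_size=20, n_parents=5,
                        generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_config(rng, n_stages) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fittest candidates as parents (elitism).
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        children = []
        while len(children) < pop_size - n_parents:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=fitness)


best_alpha, best_ratios = evolutionary_search(toy_fitness)
```

Because the parents survive each generation unchanged, the best fitness in the population never decreases. In a realistic setting each fitness call would be expensive, so the population size and generation count would need to be chosen with the evaluation budget in mind.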
