FaceSleuth-R: Adaptive Orientation-Aware Attention for Robust Micro-Expression Recognition

Linquan Wu, Tianxiang Jiang, Haoyu Yang, Wenhao Duan, Shaochao Lin, Zixuan Wang, Yini Fang, Jacky Keung

TL;DR

FaceSleuth-R tackles MER generalization by shifting attention from spatial appearance cues to domain-invariant motion orientations via the Single-Orientation Attention (SOA) module. The dual-stream architecture decouples motion and appearance, with SOA learning per-layer orientation $\theta$ to reweight feature maps toward robust directional cues, yielding a near-vertical prior around $88^\circ$. Across LOSO intra-dataset and LODO cross-dataset evaluations on CASME II, SAMM, and SMIC-HS, the method achieves state-of-the-art results and significantly narrows generalization gaps, demonstrating strong resilience to domain shifts. The work highlights orientation-aware attention as a key paradigm for truly generalized MER with practical impact.
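
To make the idea of reweighting features toward a directional cue concrete, the following is a minimal sketch and not the paper's SOA operator. It assumes each feature location carries a 2D motion vector `(dx, dy)` and weights it by the squared cosine of the angle between that vector and the orientation $\theta$. The names `orientation_weight` and `reweight` are hypothetical, and $\theta$ is fixed here, whereas the paper learns it per layer.

```python
import math


def orientation_weight(dx, dy, theta_deg):
    """Return a weight in [0, 1] that peaks when (dx, dy) lies along theta_deg.

    Illustrative assumption only: this squared-cosine weighting is not
    taken from the paper.
    """
    magnitude = math.hypot(dx, dy)
    if magnitude == 0.0:
        return 0.0  # no motion at this location, so nothing to emphasize
    theta = math.radians(theta_deg)
    # Cosine of the angle between the motion vector and the orientation axis.
    projection = (dx * math.cos(theta) + dy * math.sin(theta)) / magnitude
    # Squaring removes the sign: an orientation is an axis, so upward and
    # downward motion along it receive the same weight.
    return projection ** 2


def reweight(features, flow, theta_deg):
    """Scale each feature value by the orientation weight of its motion vector."""
    return [
        f * orientation_weight(dx, dy, theta_deg)
        for f, (dx, dy) in zip(features, flow)
    ]


if __name__ == "__main__":
    theta = 88.0  # near-vertical prior reported in the TL;DR
    flow = [(0.0, 1.0), (1.0, 0.0), (0.0, -1.0)]  # up, sideways, down
    print(reweight([1.0, 1.0, 1.0], flow, theta))
```

With $\theta = 88^\circ$, vertical motion in either direction keeps a weight close to 1, while horizontal motion is suppressed to a weight close to 0. In a trainable version, $\theta$ would be a parameter updated by gradient descent, which is possible because the weight is differentiable in $\theta$.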

Abstract

Micro-expression recognition (MER) has achieved impressive accuracy in controlled laboratory settings. However, its real-world applicability faces a significant generalization cliff, severely hindering practical deployment due to poor performance on unseen data and susceptibility to domain shifts. Existing attention mechanisms often overfit to dataset-specific appearance cues or rely on fixed spatial priors, making them fragile in diverse environments. We posit that robust MER requires focusing on quasi-invariant motion orientations inherent to micro-expressions, rather than superficial pixel-level features. To this end, we introduce FaceSleuth-R, a framework centered on our novel Single-Orientation Attention (SOA) module. SOA is a lightweight, differentiable operator that enables the network to learn layer-specific optimal orientations, effectively guiding attention towards these robust motion cues. Through extensive experiments, we demonstrate that SOA consistently discovers a universal near-vertical motion prior across diverse datasets. More critically, FaceSleuth-R showcases superior generalization in rigorous Leave-One-Dataset-Out (LODO) protocols, significantly outperforming baselines and state-of-the-art methods when confronted with domain shifts. Furthermore, our approach establishes state-of-the-art results across several benchmarks. This work highlights adaptive orientation-aware attention as a key paradigm for developing truly generalized and high-performing MER systems.
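
The Leave-One-Dataset-Out protocol mentioned above can be sketched as a simple split generator: each dataset is held out once for testing while the model trains on the rest. This is a minimal illustration of the general protocol, not the authors' evaluation code. The function name `leave_one_out` is hypothetical, and the dataset names are the three benchmarks listed in the TL;DR.

```python
from typing import Iterator, List, Tuple


def leave_one_out(groups: List[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training_groups, held_out_group) for every group in turn."""
    for held_out in groups:
        # Train on everything except the held-out group.
        train = [g for g in groups if g != held_out]
        yield train, held_out


if __name__ == "__main__":
    datasets = ["CASME II", "SAMM", "SMIC-HS"]
    for train, test in leave_one_out(datasets):
        print(f"train on {train}, test on {test}")
```

The same generator covers Leave-One-Subject-Out (LOSO) if it is given subject identifiers from a single dataset instead of dataset names. The difference between the protocols is the unit that is held out, which is why LODO exposes domain shift between datasets while LOSO does not.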

Paper Structure

This paper contains 9 sections, 3 equations, 3 figures, 2 tables, 1 algorithm.

Figures (3)

  • Figure 1: Our proposed FaceSleuth-R achieves state-of-the-art performance and strong generalization ability. (a) The learned orientation $\theta$ consistently converges to a near-vertical prior ($\sim 88^\circ$) across benchmarks. (b) Our model outperforms recent SOTA methods in intra-dataset evaluations. (c) On the more challenging LODO task, an ablation study further demonstrates that our key components significantly improve the model's generalization to unseen datasets.
  • Figure 2: FaceSleuth-R: A dual-stream framework that enhances generalization by separating motion (SOA($\theta_i$)) and appearance (FPF) cues. The features are fused and passed to a classifier to yield $\Delta_{\mathrm{final}}$.
  • Figure 3: Visualization of attention maps generated by the SOA.