Fine-Grained 3D Facial Reconstruction for Micro-Expressions

Che Sun, Xinjie Zhang, Rui Gao, Xu Chen, Yuwei Wu, Yunde Jia

TL;DR

A plug-and-play dynamic-encoded module is devised to extract micro-expression features for global facial action; it leverages prior knowledge from abundant macro-expression data to mitigate the scarcity of micro-expression data.

Abstract

Recent advances in 3D facial expression reconstruction have demonstrated remarkable performance in capturing macro-expressions, yet the reconstruction of micro-expressions remains unexplored. This novel task is particularly challenging because micro-expressions are subtle, transient, and low in intensity, which complicates the extraction of the stable and discriminative features essential for accurate reconstruction. In this paper, we propose a fine-grained micro-expression reconstruction method that integrates a global dynamic feature, which captures stable facial motion patterns, with a locally-enriched feature that incorporates multiple informative cues from 2D motions, facial priors, and 3D facial geometry. Specifically, we devise a plug-and-play dynamic-encoded module to extract micro-expression features for global facial action, allowing it to leverage prior knowledge from abundant macro-expression data to mitigate the scarcity of micro-expression data. Subsequently, a dynamic-guided mesh deformation module extracts aggregated local features from dense optical flow, sparse landmark cues, and facial mesh geometry, and adaptively refines fine-grained facial micro-expressions without compromising global 3D geometry. Extensive experiments on micro-expression datasets demonstrate that our method consistently outperforms state-of-the-art methods in both geometric accuracy and perceptual detail.

Paper Structure (22 sections, 26 equations, 4 figures, 2 tables)


Figures (4)

  • Figure 1: The overall framework of our method.
  • Figure 2: The dynamic-encoded module generates initialized 3D facial meshes from an onset image and input video.
  • Figure 3: The dynamic-guided mesh deformation module refines the initialized meshes using locally-enriched features to reconstruct subtle dynamic details of micro-expressions.
  • Figure 4: Visualizations of our method, SMIRK with micro-expression data fine-tuning (SMIRK-FT), and SMIRK without fine-tuning. The red boxes highlight the facial regions where noticeable micro-expression changes occur at $I_k$.