MagicFace: High-Fidelity Facial Expression Editing with Action-Unit Control
Mengting Wei, Tuomas Varanka, Xingxun Jiang, Huai-Qian Khor, Guoying Zhao
TL;DR
MagicFace addresses the problem of high-fidelity facial expression editing by conditioning a diffusion model on action-unit (AU) variations while preserving identity, pose, and background. It introduces an ID encoder that merges identity features via self-attention and an Attribute Controller that separates background and pose from facial edits, enabling precise and continuous AU-driven editing across arbitrary identities. AU variations are defined as $\mathbf{c}_{AU} = \mathbf{c}_{ID} - \mathbf{c}_{tgt}$, and the model uses AU dropout together with classifier-free guidance. It is trained on 30K Aff-Wild identity pairs with an AU-edit loss $\mathcal{L}_{AUEdit}$, and achieves strong AU accuracy and robust identity preservation, even in out-of-domain scenarios. The work demonstrates practical, user-friendly facial expression editing with potential applications in avatars and digital media, while acknowledging societal implications and the need for safeguards against misuse.
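As a rough illustration of the conditioning described above, the following Python sketch computes an AU variation vector, applies condition dropout, and combines noise predictions with the standard classifier-free guidance formula. It is not the authors' implementation: the function names, the use of plain lists, the dropout probability argument, and the choice of an all-zero vector as the "null" condition are all assumptions made for this example. The sign convention follows the definition $\mathbf{c}_{AU} = \mathbf{c}_{ID} - \mathbf{c}_{tgt}$ as written in this summary.

```python
import random


def au_variation(c_id, c_tgt):
    """Relative AU variation, using the convention c_AU = c_ID - c_tgt from the text."""
    if len(c_id) != len(c_tgt):
        raise ValueError("AU vectors must have the same length")
    return [a - b for a, b in zip(c_id, c_tgt)]


def drop_condition(c_au, p_drop, rng):
    """AU dropout for training: with probability p_drop, replace the condition
    with a null condition. The all-zero null vector is an assumption."""
    if rng.random() < p_drop:
        return [0.0] * len(c_au)
    return list(c_au)


def cfg_combine(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance:
    eps = eps_uncond + scale * (eps_cond - eps_uncond)."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical AU intensities for the identity image and the target.
    c_au = au_variation([1.0, 0.0, 3.0], [0.5, 2.0, 3.0])
    print("AU variation:", c_au)
    print("after dropout:", drop_condition(c_au, p_drop=0.1, rng=rng))
    # Hypothetical noise predictions from the conditional and unconditional passes.
    print("guided noise:", cfg_combine([0.0, 1.0], [1.0, 3.0], scale=2.0))
```

In this sketch, a guidance scale of 1 reduces to the conditional prediction, and larger scales push the output further toward the AU-conditioned direction.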
Abstract
We address the problem of facial expression editing by controlling the relative variation of facial action units (AUs) of the same person. This enables us to edit this specific person's expression in a fine-grained, continuous and interpretable manner, while preserving their identity, pose, background and detailed facial attributes. Key to our model, which we dub MagicFace, is a diffusion model conditioned on AU variations and an ID encoder that preserves facial details with high consistency. Specifically, to keep facial details consistent with the input identity, we leverage the power of pretrained Stable-Diffusion models and design an ID encoder to merge appearance features through self-attention. To keep background and pose consistent, we introduce an efficient Attribute Controller that explicitly informs the model of the current background and pose of the target. By injecting AU variations into a denoising UNet, our model can animate arbitrary identities with various AU combinations, yielding superior results in high-fidelity expression editing compared to other facial expression editing works. Code is publicly available at https://github.com/weimengting/MagicFace.
