VRMM: A Volumetric Relightable Morphable Head Model
Haotian Yang, Mingwu Zheng, Chongyang Ma, Yu-Kun Lai, Pengfei Wan, Haibin Huang
TL;DR
VRMM is a volumetric morphable head model with disentangled, low-dimensional codes for identity $z_{id}$, expression $z_{e}$, and illumination $l$, trained in a self-supervised framework on dynamic multi-view data. Built on Mixture of Volumetric Primitives and a physically inspired relighting decoder, VRMM jointly learns a multi-identity head model with decoders for mesh, identity, transformation, opacity, and color, and uses a detach-concatenate strategy to stabilize training. A disentangled training regime, including an expression-consistency loss $\mathcal{L}_{exp}$ and KL regularization $\mathcal{L}_{KLD}$, enables relightable and animatable avatar reconstruction from few-shot inputs, and a prior-preserving fine-tuning stage mitigates overfitting. Experiments on a 254-subject dataset show state-of-the-art performance for novel view synthesis and single-view reconstruction, and demonstrate avatar personalization and relighting across scenes. VRMM thus offers a practical route to high-fidelity, controllable 3D facial avatars for avatar creation, animation, and telepresence.
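The summary above names a KL regularization term $\mathcal{L}_{KLD}$ but does not give its form. As a minimal sketch, assuming the common VAE-style setup in which a latent code is modeled as a diagonal Gaussian and pulled toward a standard normal prior, the term could be computed as below. The function name `kl_to_standard_normal` and the example values are hypothetical and are not taken from the paper.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions.

    Assumption: the latent code follows a diagonal Gaussian posterior, as in a
    standard VAE. The paper's exact definition of L_KLD may differ.
    """
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))

# Hypothetical 4-dimensional latent code (illustrative values only).
z_mu = [0.2, -0.1, 0.0, 0.4]
z_log_var = [-0.5, 0.0, 0.1, -0.2]
loss_kld = kl_to_standard_normal(z_mu, z_log_var)
```

The term is zero when the posterior already matches the prior and grows as the mean drifts from zero or the variance drifts from one, which is what keeps such a latent space compact enough to sample from.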
Abstract
In this paper, we introduce the Volumetric Relightable Morphable Model (VRMM), a novel volumetric and parametric facial prior for 3D face modeling. While recent volumetric prior models offer improvements over traditional methods like 3D Morphable Models (3DMMs), they face challenges in model learning and personalized reconstruction. Our VRMM overcomes these challenges by employing a novel training framework that efficiently disentangles and encodes the latent spaces of identity, expression, and lighting into low-dimensional representations. This framework, designed with self-supervised learning, significantly relaxes the requirements on training data, making it more feasible in practice. The learned VRMM offers relighting capabilities and covers a comprehensive range of expressions. We demonstrate the versatility and effectiveness of VRMM through applications such as avatar generation, facial reconstruction, and animation. Additionally, we address the common issue of overfitting in generative volumetric models with a novel prior-preserving personalization framework based on VRMM. This approach enables high-quality 3D face reconstruction from even a single portrait input. Our experiments demonstrate the potential of VRMM to advance 3D face modeling.
