Animatable and Relightable Gaussians for High-fidelity Human Avatar Modeling
Zhe Li, Yipengjing Sun, Zerong Zheng, Lizhen Wang, Shengping Zhang, Yebin Liu
TL;DR
This work tackles lifelike animatable human avatar modeling from RGB videos by introducing Animatable Gaussians, an explicit avatar representation that couples 3D Gaussian splatting with 2D CNNs. A character-specific parametric template is learned and parameterized on front and back Gaussian maps, which supports high-fidelity pose-dependent dynamics while keeping rendering efficient. The method further incorporates a PCA-based pose projection to generalize to novel poses, and a physically-based rendering pipeline that disentangles geometry, material, and lighting so the avatar can be relit realistically under new illumination. Experiments on THuman4.0, AvatarReX, and ActorsHQ demonstrate superior animation quality and relighting fidelity compared with NeRF-based and prior Gaussian-based avatars, particularly for loose garments such as dresses. The approach advances practical 3D human avatars for holoportation and extended reality by delivering lifelike, relightable appearances that generalize to new poses.
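The PCA-based pose projection mentioned above can be illustrated with a minimal sketch. It assumes poses are flat vectors of joint parameters and that projection means expressing a novel pose in the PCA basis of the training poses and clamping each coefficient to a few standard deviations. The function names, the number of components, and the `max_sigma` bound are hypothetical choices for illustration, not values taken from the paper.

```python
import numpy as np


def fit_pose_pca(train_poses: np.ndarray, n_components: int):
    """Fit a PCA basis to training poses of shape (num_frames, pose_dim)."""
    mean = train_poses.mean(axis=0)
    centered = train_poses - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    # Standard deviation of the training data along each principal direction.
    std = singular_values[:n_components] / np.sqrt(len(train_poses) - 1)
    return mean, basis, std


def project_pose(pose, mean, basis, std, max_sigma=2.0):
    """Pull a novel pose toward the training distribution (assumed scheme)."""
    coeffs = basis @ (pose - mean)
    # Hypothetical bound: keep each coefficient within max_sigma deviations.
    coeffs = np.clip(coeffs, -max_sigma * std, max_sigma * std)
    return mean + basis.T @ coeffs


rng = np.random.default_rng(0)
train_poses = rng.normal(size=(200, 6))  # synthetic stand-in for real poses
mean, basis, std = fit_pose_pca(train_poses, n_components=3)

# An extreme pose far outside the training range along the first component.
novel_pose = mean + 100.0 * basis[0]
projected = project_pose(novel_pose, mean, basis, std)
```

Under these assumptions, the projected pose stays within the range the pose-conditioned network saw during training, at the cost of not reproducing the extreme input pose exactly.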
Abstract
Modeling animatable human avatars from RGB videos is a long-standing and challenging problem. Recent works usually adopt MLP-based neural radiance fields (NeRF) to represent 3D humans, but it remains difficult for pure MLPs to regress pose-dependent garment details. To this end, we introduce Animatable Gaussians, a new avatar representation that leverages powerful 2D CNNs and 3D Gaussian splatting to create high-fidelity avatars. To associate 3D Gaussians with the animatable avatar, we learn a parametric template from the input videos, and then parameterize the template on two canonical Gaussian maps (front and back) in which each pixel represents a 3D Gaussian. The learned template adapts to the garments being worn, which allows it to model looser clothes such as dresses. This template-guided 2D parameterization enables us to employ a powerful StyleGAN-based CNN to learn the pose-dependent Gaussian maps for modeling detailed dynamic appearances. Furthermore, we introduce a pose projection strategy for better generalization to novel poses. To achieve realistic relighting of animatable avatars, we introduce physically-based rendering into the avatar representation to decompose avatar materials and environment illumination. Overall, our method can create lifelike avatars with dynamic, realistic, generalized and relightable appearances. Experiments show that our method outperforms other state-of-the-art approaches.
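The idea that each pixel of a front or back canonical map represents one 3D Gaussian can be sketched as follows. This is an illustrative toy, assuming each map stores a canonical 3D position per foreground pixel and `None` for background. The `Gaussian` class and `gaussians_from_position_map` helper are hypothetical names, and the real representation also carries per-Gaussian attributes (such as covariance, opacity, and color) predicted by the CNN, which are omitted here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
PositionMap = List[List[Optional[Vec3]]]


@dataclass
class Gaussian:
    position: Vec3          # center in canonical space
    view: str               # "front" or "back" map the pixel came from
    pixel: Tuple[int, int]  # (row, col) index into that map


def gaussians_from_position_map(position_map: PositionMap, view: str) -> List[Gaussian]:
    """Create one Gaussian per foreground pixel of a canonical map."""
    gaussians = []
    for row, line in enumerate(position_map):
        for col, position in enumerate(line):
            if position is None:
                continue  # background pixel: no template surface here
            gaussians.append(Gaussian(position, view, (row, col)))
    return gaussians


# Tiny 2x2 maps standing in for the template rendered from the front and back.
front_map: PositionMap = [
    [None, (0.0, 1.0, 0.1)],
    [(-0.1, 0.5, 0.1), (0.1, 0.5, 0.1)],
]
back_map: PositionMap = [
    [(0.0, 1.0, -0.1), None],
    [None, (0.1, 0.5, -0.1)],
]

avatar = (gaussians_from_position_map(front_map, "front")
          + gaussians_from_position_map(back_map, "back"))
print(len(avatar))  # number of Gaussians = number of foreground pixels
```

Because the Gaussians live on a regular 2D grid, their attributes can be predicted by an image-to-image network, which is what makes a 2D CNN applicable to an otherwise unstructured 3D point set.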
