
HHAvatar: Gaussian Head Avatar with Dynamic Hairs

Zhanfeng Liao, Yuelang Xu, Zhe Li, Qijing Li, Boyao Zhou, Ruifeng Bai, Di Xu, Hongwen Zhang, Yebin Liu

TL;DR

This paper proposes HHAvatar, a high-fidelity head avatar with dynamic hair modeling that is represented by controllable 3D Gaussians. Building on Gaussian Head Avatar, it introduces a hybrid head model into the avatar representation, together with a training method that considers temporal information and an occlusion perception module to model the non-rigid motion of hair.

Abstract

Creating high-fidelity 3D head avatars has long been an active research topic, but it remains a great challenge under lightweight sparse-view setups. In this paper, we propose HHAvatar, a high-fidelity head avatar with dynamic hair modeling that is represented by controllable 3D Gaussians. We first use 3D Gaussians to represent the appearance of the head, and then jointly optimize neutral 3D Gaussians and a fully learned MLP-based deformation field to capture complex expressions. The two parts benefit each other, so our method can model fine-grained dynamic details while ensuring expression accuracy. Furthermore, we devise a geometry-guided initialization strategy based on implicit SDF and Deep Marching Tetrahedra to improve the stability and convergence of the training procedure. To address the problem of dynamic hair modeling, we build on Gaussian Head Avatar and introduce a hybrid head model into our avatar representation, together with a training method that considers temporal information and an occlusion perception module to model the non-rigid motion of hair. Experiments show that our approach outperforms other state-of-the-art sparse-view methods, achieving ultra high-fidelity rendering quality at 2K resolution even under exaggerated expressions, and driving hair plausibly with the motion of the head.

Paper Structure (17 sections, 24 equations, 13 figures, 5 tables)

Figures (13)

  • Figure 1: HHAvatar achieves ultra high-fidelity image synthesis with controllable expressions at 2K resolution. The top shows different identities animated by the same expression. The bottom shows that hair positions can vary for identical poses because the hair state (i.e., position and speed) at the previous moment differs.
  • Figure 2: The pipeline of HHAvatar rendering and reconstruction. We first optimize the guidance model, which includes a neutral mesh, a deformation MLP and a color MLP, in the initialization stage. Then we use them to initialize the neutral Gaussians and the dynamic generator. Finally, 2K RGB images are synthesized through differentiable rendering and the super-resolution network, and the segmentation maps of the hair and the head are also synthesized through differentiable rendering. HHAvatar is trained under the supervision of multi-view RGB videos and multi-view masks from face parsing.
  • Figure 3: Details of the temporal module. The input to the Hair Dynamic MLPs at time step $t$ is $\boldsymbol{X}_0$ (the position of the neutral Gaussian point), $\{\boldsymbol{X}'_{t-1}, \boldsymbol{X}'_{t-2}\}$ (the positions of the expressive Gaussian point at time steps $t-1$ and $t-2$), and $\{\beta_{t}, \beta_{t-1}, \beta_{t-2}\}$ (the poses of the head at time steps $t$, $t-1$, and $t-2$). The Hair Dynamic MLPs are described in Sec. \ref{subsubsec:dynamichair}.
  • Figure 4: Qualitative comparisons of different methods on the self-reenactment task on the NeRSemble dataset zhao2023havatar. From left to right: NeRFBlendShape gao2022reconstructing, NeRFace gafni2021dynamic, HAvatar zhao2023havatar and Ours. Our method reconstructs details such as beards, teeth and eyes with high quality.
  • Figure 5: Qualitative comparisons of different methods on the self-reenactment task with dynamic hair on the self-captured dataset. From left to right: HAvatar zhao2023havatar, GaussianAvatars qian2024gaussianavatars, MeGA wang2024mega and Ours. Our method reconstructs details with high quality.
  • ...and 8 more figures
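
The Figure 3 caption lists the inputs of the temporal module: the neutral position $\boldsymbol{X}_0$, the two previous expressive positions, and three consecutive head poses. The sketch below only illustrates how such an input could be assembled and rolled forward in time. It is not the authors' implementation: the 6-dimensional pose, the single two-layer MLP with random weights, the residual update (neutral position plus predicted offset), and initialising the history with the neutral position are all assumptions made for illustration.

```python
import random

POSE_DIM = 6  # assumption: head pose as 3 rotation + 3 translation values


def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Two-layer MLP with small random weights (stand-in for trained weights)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2


def mlp_forward(mlp, x):
    """Linear layer, ReLU, linear layer (biases omitted for brevity)."""
    w1, w2 = mlp
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def hair_dynamic_input(x0, x_prev1, x_prev2, pose_t, pose_prev1, pose_prev2):
    """Concatenate X_0, {X'_{t-1}, X'_{t-2}} and {beta_t, beta_{t-1}, beta_{t-2}}."""
    return (list(x0) + list(x_prev1) + list(x_prev2)
            + list(pose_t) + list(pose_prev1) + list(pose_prev2))


def step_hair_point(mlp, x0, x_prev1, x_prev2, poses):
    """Predict X'_t. Assumption: the MLP outputs an offset added to X_0."""
    features = hair_dynamic_input(x0, x_prev1, x_prev2, *poses)
    offset = mlp_forward(mlp, features)
    return [a + b for a, b in zip(x0, offset)]


def rollout(mlp, x0, pose_sequence):
    """Roll one hair point forward, feeding each prediction back as history."""
    # Assumption: before any motion, both history slots hold the neutral position.
    prev1, prev2 = list(x0), list(x0)
    trajectory = []
    for t in range(2, len(pose_sequence)):
        poses = (pose_sequence[t], pose_sequence[t - 1], pose_sequence[t - 2])
        x_t = step_hair_point(mlp, x0, prev1, prev2, poses)
        trajectory.append(x_t)
        prev1, prev2 = x_t, prev1  # shift the history window by one step
    return trajectory


if __name__ == "__main__":
    mlp = make_mlp(in_dim=3 * 3 + 3 * POSE_DIM, hidden_dim=16, out_dim=3)
    neutral = [0.0, 0.1, 0.2]
    # Hypothetical head motion: a slowly growing rotation about one axis.
    head_poses = [[0.05 * t] + [0.0] * (POSE_DIM - 1) for t in range(6)]
    for point in rollout(mlp, neutral, head_poses):
        print(point)
```

Because the previous positions are part of the input, two frames with the same head pose can receive different features when their histories differ, which is consistent with the behaviour described in the Figure 1 caption.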