HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs

Artem Sevastopolsky, Philip-William Grassal, Simon Giebenhain, ShahRukh Athar, Luisa Verdoliva, Matthias Niessner

TL;DR

HeadCraft addresses the challenge of producing high-detail, animatable 3D head models by marrying an explicit parametric head prior with a learned UV-space displacement field. It performs a two-stage registration to fit high-frequency displacements onto a FLAME-based template using the NPHM dataset, then trains StyleGAN2-ADA to model a distribution over UV displacement maps $U=f(z)$, enabling unconditional generation and fitting to partial depth observations. The approach supports semantic editing, interpolation, and depth-based completion, while preserving animation compatibility through the FLAME rig. Empirically, HeadCraft demonstrates competitive fidelity and diversity against SDF-based heads and related baselines, while enabling detailed hair and scalp geometry to be generated and animated within standard graphics pipelines.

Abstract

Current advances in human head modeling allow the generation of plausible-looking 3D head models via neural representations, such as NeRFs and SDFs. Nevertheless, constructing complete high-fidelity head models with explicitly controlled animation remains an issue. Furthermore, completing the head geometry based on a partial observation, e.g., coming from a depth sensor, while preserving a high level of detail is often problematic for the existing methods. We introduce a generative model for detailed 3D head meshes on top of an articulated 3DMM, simultaneously allowing explicit animation and high-detail preservation. Our method is trained in two stages. First, we register a parametric head model with vertex displacements to each mesh of the recently introduced NPHM dataset of accurate 3D head scans. The estimated displacements are baked into a hand-crafted UV layout. Second, we train a StyleGAN model to generalize over the UV maps of displacements, which we later refer to as HeadCraft. The decomposition of the parametric model and high-quality vertex displacements allows us to animate the model and modify the regions semantically. We demonstrate the results of unconditional sampling, fitting to a scan and editing. The project page is available at https://seva100.github.io/headcraft.

Paper Structure (17 sections, 12 equations, 21 figures, 17 tables)

Figures (21)

  • Figure 1: We present HeadCraft, a generative model for highly-detailed human heads, ready for animation. Our method is trained on 2D displacement maps collected by registering a parametric template head with free surface displacements to a large set of 3D head scans. The resulting model is highly versatile and its latent code can be fit to an arbitrary depth observation.
  • Figure 2: An overview of the method. In the registration stage, we (a) fit the FLAME template by the face landmarks to the scan from the NPHM dataset and highly subdivide it, (b) optimize for the vertex displacements in $\mathbb{R}^3$ to fit the rough geometry with strong regularizations, (c) optimize for the scalar refinements of the displacements along the normal directions, and (d) bake the displacements into a UV offset map. To generalize over the UV offset maps, we train a StyleGAN2 model. After training, the offsets can be applied to an arbitrary FLAME template by subdividing it and (e) querying the generated UV offset map with the (u, v) locations of the FLAME vertices.
  • Figure 3: Visual comparison of fidelity and diversity of the meshes generated by various methods. For Ours, random FLAMEs are sampled from a Gaussian distribution with statistics calculated over the NPHM dataset; the same is done for the PCA baseline pre-fitted to our UV registrations. Meshes from NPHM are obtained by sampling the latent codes and running marching cubes over the generated SDF representations. We demonstrate higher variability of the produced head geometry and better details than the other methods.
  • Figure 4: Randomly generated samples from HeadCraft and the corresponding nearest neighbors in the NPHM dataset among the scans used for training. $L_2$ distance over the scalp part of the displacement maps was used. Displacements were added to a random FLAME template for all samples.
  • Figure 5: Ablation over the one-stage vs. two-stage registration. Regressing only vector displacements (a) yields overly smooth geometry, and learning them only along the normals (b) introduces spikes, just like running the first stage with a smaller $\boldsymbol{\lambda}$ (c).
  • ...and 16 more figures
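
Step (e) in the Figure 2 caption, querying a generated UV offset map at the (u, v) locations of the template vertices, can be illustrated with a small bilinear lookup. The sketch below is not the authors' implementation. It assumes the map stores 3D offsets that are added directly to vertex positions, that u indexes columns and v indexes rows with (0, 0) at the top-left texel, and that the names `sample_uv_offsets` and `apply_offsets` are hypothetical helpers introduced only for this example. The paper's actual UV convention and offset parameterization may differ.

```python
from math import floor


def sample_uv_offsets(offset_map, uv):
    """Bilinearly sample an H x W grid of 3D offsets at one (u, v) in [0, 1]^2.

    Assumption: u maps to columns and v maps to rows, with (0, 0) at the
    top-left texel. This convention is illustrative, not taken from the paper.
    """
    height, width = len(offset_map), len(offset_map[0])
    u, v = uv
    # Continuous texel coordinates, clamped to the valid range.
    x = min(max(u, 0.0), 1.0) * (width - 1)
    y = min(max(v, 0.0), 1.0) * (height - 1)
    x0, y0 = int(floor(x)), int(floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    tx, ty = x - x0, y - y0
    out = []
    for c in range(3):
        # Interpolate along u on the two neighbouring rows, then along v.
        top = (1 - tx) * offset_map[y0][x0][c] + tx * offset_map[y0][x1][c]
        bottom = (1 - tx) * offset_map[y1][x0][c] + tx * offset_map[y1][x1][c]
        out.append((1 - ty) * top + ty * bottom)
    return tuple(out)


def apply_offsets(vertices, vertex_uvs, offset_map):
    """Add the sampled offset to each vertex (hypothetical helper)."""
    displaced = []
    for vertex, uv in zip(vertices, vertex_uvs):
        offset = sample_uv_offsets(offset_map, uv)
        displaced.append(tuple(p + d for p, d in zip(vertex, offset)))
    return displaced


# Toy 2 x 2 offset map standing in for a generated UV displacement map.
toy_map = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
]
vertices = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
vertex_uvs = [(0.5, 0.5), (1.0, 0.0)]
displaced = apply_offsets(vertices, vertex_uvs, toy_map)
```

Because the lookup depends only on per-vertex (u, v) coordinates, the same map can in principle be applied to any template that shares the UV layout, which is consistent with the caption's remark that the offsets can be applied to an arbitrary FLAME template after subdivision.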