GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

Shenhan Qian, Tobias Kirschstein, Liam Schoneveld, Davide Davoli, Simon Giebenhain, Matthias Nießner

TL;DR

GaussianAvatars tackles the challenge of producing photorealistic, controllable head avatars from multi-view video by rigging anisotropic 3D Gaussian splats to a FLAME morphable head model and optimizing them end-to-end. The method introduces a binding inheritance mechanism and adaptive density control to preserve controllability while enriching detail, enabling accurate animation under novel expressions and poses. Empirical results on NeRSemble show substantial gains in novel-view rendering and cross-identity reenactment over state-of-the-art baselines, with efficient training and inference. This work advances practical head avatars for immersive media and telepresence and highlights future directions for including hair and lighting flexibility.
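The binding inheritance idea can be illustrated with a small sketch. This is not the authors' implementation: the `Splat` record, the `densify` and `prune` helpers, and the opacity-based criteria are all hypothetical names and choices made for illustration. The only point it shows is that a splat created during densification copies its parent's triangle index, so every splat stays rigged to the mesh.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Splat:
    # All fields are illustrative; a real splat also carries rotation, scale, colour.
    triangle_id: int                      # mesh triangle this splat is bound to
    local_pos: Tuple[float, float, float]  # centre in the triangle's local frame
    opacity: float


def densify(splats: List[Splat], should_clone: Callable[[Splat], bool]) -> List[Splat]:
    """Add a copy of every splat selected by `should_clone`.

    `replace(s)` copies all fields, so the child inherits `triangle_id`
    from its parent and remains attached to the same triangle.
    """
    children = [replace(s) for s in splats if should_clone(s)]
    return list(splats) + children


def prune(splats: List[Splat], min_opacity: float) -> List[Splat]:
    """Drop splats below a (hypothetical) opacity threshold."""
    return [s for s in splats if s.opacity >= min_opacity]


splats = [Splat(0, (0.0, 0.0, 0.0), 0.9), Splat(3, (0.0, 0.0, 0.0), 0.01)]
grown = densify(splats, lambda s: s.opacity > 0.5)
kept = prune(grown, min_opacity=0.05)
```

Because the binding is stored per splat and copied on creation, no separate bookkeeping is needed to keep new splats animatable by the mesh.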

Abstract

We introduce GaussianAvatars, a new method to create photorealistic head avatars that are fully controllable in terms of expression, pose, and viewpoint. The core idea is a dynamic 3D representation based on 3D Gaussian splats that are rigged to a parametric morphable face model. This combination facilitates photorealistic rendering while allowing for precise animation control via the underlying parametric model, e.g., through expression transfer from a driving sequence or by manually changing the morphable model parameters. We parameterize each splat by a local coordinate frame of a triangle and optimize for an explicit displacement offset to obtain a more accurate geometric representation. During avatar reconstruction, we jointly optimize for the morphable model parameters and Gaussian splat parameters in an end-to-end fashion. We demonstrate the animation capabilities of our photorealistic avatar in several challenging scenarios. For instance, we show reenactments from a driving video, where our method outperforms existing works by a significant margin.
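To make the triangle-local parameterization concrete, here is a minimal sketch of mapping a splat centre from a triangle's local frame to global space. The abstract does not spell out how the frame is built, so the construction below is an assumption: the origin is the triangle centroid, the axes are one edge direction, the normal, and their cross product, and the scale is the mean of that edge's length and the triangle height over it. The function names `triangle_frame` and `local_to_global` are hypothetical.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _norm(a):
    return math.sqrt(sum(x * x for x in a))


def _unit(a):
    n = _norm(a)
    return [x / n for x in a]


def triangle_frame(v0, v1, v2):
    """Return (R, T, k) for a non-degenerate triangle (assumed construction).

    R: 3x3 rotation as nested lists, columns = edge dir, normal, edge x normal.
    T: centroid of the triangle.  k: scalar size of the triangle.
    """
    edge = _sub(v1, v0)
    area_vec = _cross(edge, _sub(v2, v0))
    e = _unit(edge)
    n = _unit(area_vec)
    b = _cross(e, n)  # completes a right-handed orthonormal basis
    R = [[e[i], n[i], b[i]] for i in range(3)]
    T = [(p + q + r) / 3.0 for p, q, r in zip(v0, v1, v2)]
    height = _norm(area_vec) / _norm(edge)
    k = 0.5 * (_norm(edge) + height)
    return R, T, k


def local_to_global(mu_local, R, T, k):
    """Global centre = k * R @ mu_local + T."""
    rotated = [sum(R[i][j] * mu_local[j] for j in range(3)) for i in range(3)]
    return [k * rotated[i] + T[i] for i in range(3)]


# A splat sitting one local unit along the triangle normal.
R, T, k = triangle_frame([0, 0, 0], [2, 0, 0], [0, 2, 0])
centre = local_to_global([0.0, 1.0, 0.0], R, T, k)

# When the mesh moves, the same local offset follows the triangle.
R2, T2, k2 = triangle_frame([1, 0, 0], [3, 0, 0], [1, 2, 0])
moved = local_to_global([0.0, 1.0, 0.0], R2, T2, k2)
```

Only the local offset is stored per splat; recomputing the frame from the tracked mesh at each frame is what lets the morphable model drive the splats.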

Paper Structure

This paper contains 19 sections, 8 equations, 8 figures, and 5 tables.

Figures (8)

  • Figure 1: Overview. Our method binds 3D Gaussian splats locally to a FLAME (Li et al., 2017) mesh. We take the tracked mesh for each frame and transform the splats from local to global space before rendering them with 3D Gaussian Splatting (Kerbl et al., 2023). We optimize the splats in the local space by minimizing a color loss on the rendering. We add and remove splats adaptively, with new splats inheriting their binding to triangles, so that all splats remain rigged throughout the optimization procedure. Further, we regularize the position and scaling of 3D Gaussian splats to suppress artifacts during animation.
  • Figure 2: Qualitative comparison on novel-view synthesis and self-reenactment of head avatars. Our method outperforms state-of-the-art methods by producing significantly sharper rendering outputs. We obtain precise reconstruction of details such as light reflections on the eyes, hair strands, and teeth. Our results for self-reenactment show more accurate expressions compared to baselines.
  • Figure 3: Cross-identity reenactment of head avatars. We use the tracked FLAME expression and pose parameters of source actors to drive the reconstructed avatars. Our method produces high-quality rendering and transfers expressions vividly, while baseline methods suffer from artifacts and generalize poorly to novel expressions.
  • Figure 4: Fine-tuning FLAME parameters leads to better alignment of the mesh to the input image. In the example above, the movements of cheeks and lips are better captured after fine-tuning.
  • Figure 5: The position loss and the scaling loss help prevent artifacts during animation with novel expressions and poses.
  • ...and 3 more figures
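
The captions for Figures 1 and 5 mention regularizing splat position and scaling but do not give the formulas. The sketch below shows one plausible form, a thresholded penalty that is zero while a splat's local offset and local scale stay within a tolerance and grows once they exceed it. The function names, the tolerances, and the weights are illustrative assumptions, not values taken from this page.

```python
import math


def thresholded_penalty(values, tolerance):
    """L2 norm of the amounts by which |v| exceeds `tolerance` (0 if none do)."""
    excess = [max(abs(v) - tolerance, 0.0) for v in values]
    return math.sqrt(sum(e * e for e in excess))


def regularization_loss(local_pos, local_scale,
                        pos_tol=1.0, scale_tol=0.6,   # hypothetical tolerances
                        w_pos=0.01, w_scale=1.0):     # hypothetical weights
    """Weighted sum of the position and scaling penalties for one splat.

    Both quantities are measured in the triangle's local frame, so the
    tolerances are relative to the size of the parent triangle.
    """
    return (w_pos * thresholded_penalty(local_pos, pos_tol)
            + w_scale * thresholded_penalty(local_scale, scale_tol))


near = regularization_loss((0.5, 0.0, 0.0), (0.1, 0.1, 0.1))   # within tolerance
far = regularization_loss((4.0, 0.0, 0.0), (0.1, 0.1, 0.1))    # drifted away
```

The intent of such a penalty is to keep splats close to, and no larger than, the triangle that drives them, so that they do not float or stretch visibly when the mesh takes on an unseen expression or pose.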