Generating Editable Head Avatars with 3D Gaussian GANs

Guohao Li, Hongyu Yang, Yifang Men, Di Huang, Weixin Li, Ruijie Yang, Yunhong Wang

TL;DR

This work introduces EG$^2$3D, a 3D-aware GAN that uses 3D Gaussian Splatting to generate editable and animatable 3D head avatars. It decouples facial and non-facial geometry: the face is modeled by an Editable Gaussian Head (EG-Head) that combines a 3D Morphable Model with texture maps, and the hair region is modeled by a shared Gaussian point cloud with tri-plane features. This split enables precise expression control and flexible texture editing. The model injects illumination via Spherical Harmonics and a lightweight shadow CNN, and uses separate discriminators to stabilize training. Experiments show improved animation control, illumination handling, and identity consistency with near real-time rendering. The approach targets applications in virtual reality, gaming, and content creation, and the paper outlines future work on data-driven improvements, full-body avatars, and physically based rendering.
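To make the Spherical Harmonics illumination mentioned above more concrete, here is a minimal sketch of second-order SH shading of a single surface point. It is an illustration only: the function names `sh_basis` and `shade`, the 9-coefficient lighting vector, and the clamping step are assumptions for this example, not the paper's formulation, and the shadow CNN is not modeled.

```python
import math

def sh_basis(normal):
    """Evaluate the 9 real spherical-harmonic basis functions (bands 0-2)
    at the direction given by `normal` (normalized internally)."""
    x, y, z = normal
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0.0:
        raise ValueError("normal must be non-zero")
    x, y, z = x / length, y / length, z / length
    return [
        0.282095,                        # band 0 (constant term)
        0.488603 * y,                    # band 1
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,                # band 2
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]

def shade(albedo, normal, sh_coeffs):
    """Scale an RGB albedo by the SH lighting evaluated at `normal`.
    `sh_coeffs` holds 9 monochrome lighting coefficients (hypothetical layout)."""
    basis = sh_basis(normal)
    irradiance = sum(c * b for c, b in zip(sh_coeffs, basis))
    irradiance = max(irradiance, 0.0)    # clamp negative lighting (assumption)
    return tuple(a * irradiance for a in albedo)

# Usage: uniform ambient light plus a directional term along +z.
lighting = [1.0 / 0.282095, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
front = shade((0.8, 0.6, 0.4), (0.0, 0.0, 1.0), lighting)
back = shade((0.8, 0.6, 0.4), (0.0, 0.0, -1.0), lighting)
```

With this lighting vector the point facing +z comes out brighter than the one facing away, which is the kind of explicit, low-dimensional lighting control that an SH parameterization offers.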

Abstract

Generating animatable and editable 3D head avatars is essential for various applications in computer vision and graphics. Traditional 3D-aware generative adversarial networks (GANs), often using implicit fields like Neural Radiance Fields (NeRF), achieve photorealistic and view-consistent 3D head synthesis. However, these methods face limitations in deformation flexibility and editability, hindering the creation of lifelike and easily modifiable 3D heads. We propose a novel approach that enhances the editability and animation control of 3D head avatars by incorporating 3D Gaussian Splatting (3DGS) as an explicit 3D representation. This method enables easier illumination control and improved editability. Central to our approach is the Editable Gaussian Head (EG-Head) model, which combines a 3D Morphable Model (3DMM) with texture maps, allowing precise expression control and flexible texture editing for accurate animation while preserving identity. To capture complex non-facial geometries like hair, we use an auxiliary set of 3DGS and tri-plane features. Extensive experiments demonstrate that our approach delivers high-quality 3D-aware synthesis with state-of-the-art controllability. Our code and models are available at https://github.com/liguohao96/EGG3D.
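As a rough illustration of why a 3DMM allows expression control while preserving identity, the sketch below implements a generic linear morphable model: vertex positions are a mean shape plus separate identity and expression offsets. The function `morph`, the toy two-vertex mesh, and the basis values are hypothetical and are not taken from the paper or its code.

```python
def morph(mean, id_basis, exp_basis, id_coeffs, exp_coeffs):
    """Return flattened vertex coordinates of a linear 3DMM:
    mean + sum_i id_coeffs[i] * id_basis[i] + sum_j exp_coeffs[j] * exp_basis[j]."""
    out = list(mean)
    for basis, coeffs in ((id_basis, id_coeffs), (exp_basis, exp_coeffs)):
        for direction, weight in zip(basis, coeffs):
            for i, delta in enumerate(direction):
                out[i] += weight * delta
    return out

# Toy mesh with two vertices, flattened as [x0, y0, z0, x1, y1, z1].
mean = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
id_basis = [[0.1, 0.0, 0.0, 0.1, 0.0, 0.0]]    # one identity direction
exp_basis = [[0.0, 0.2, 0.0, 0.0, -0.2, 0.0]]  # one expression direction

neutral = morph(mean, id_basis, exp_basis, [1.0], [0.0])
smiling = morph(mean, id_basis, exp_basis, [1.0], [0.5])
```

Because identity and expression enter as separate additive terms, changing `exp_coeffs` moves the vertices without touching the identity offset. Any appearance attached to the mesh surface, such as a texture map, follows the deformed vertices.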

Paper Structure (12 sections, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Overview of the proposed EG$^2$3D method. (a) The generator: $G_{tex}$ generates texture maps for the Editable Gaussian Head (EG-Head) representing the face region. $G_{tri}$ and $\text{MLP}_{pos}$ generate a Gaussian point cloud with tri-plane features for the hair region, and a separate background generator $G_{back}$ handles the background. These Gaussian point clouds are merged and rendered directly at high resolution through Gaussian splatting with injected illumination information. (b) The discriminators: Separate discriminators $D_{mask}$ and $D_{img}$ help to learn the position and appearance of each Gaussian point cloud. (c) Applications with EG$^2$3D: Rendering with modified texture and exporting to 3D assets.
  • Figure 2: Comparison with state-of-the-art animatable 3D GANs on video driving.
  • Figure 3: Generated animatable and editable 3D avatars. Editing textures are courtesy of wsdf.
  • Figure 4: