MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussians

Peng Chen, Xiaobao Wei, Qingpo Wuwu, Xinyi Wang, Xingyu Xiao, Ming Lu

TL;DR

MixedGaussianAvatar presents a differentiable mixed 2D-3D Gaussian Splatting framework that preserves geometric surface fidelity via 2D Gaussians anchored to a FLAME mesh and improves color realism with 3D Gaussians added where 2D rendering falls short. A progressive two-stage training pipeline first trains the 2D Gaussians for geometry and then fine-tunes the mixed representation for appearance, while a local-to-global transformation supports animation. The approach reports state-of-the-art rendering quality together with geometrically accurate head surfaces, evaluated on multi-view and monocular datasets and supported by ablation studies. This work offers a practical, animatable head avatar representation that unifies image-based rendering and mesh reconstruction, with code to be released.

Abstract

Reconstructing high-fidelity 3D head avatars is crucial for applications such as virtual reality. Pioneering methods reconstruct realistic head avatars with Neural Radiance Fields (NeRF), but they are limited by slow training and rendering. Recent methods based on 3D Gaussian Splatting (3DGS) significantly improve training and rendering efficiency. However, the surface inconsistency of 3DGS results in subpar geometric accuracy, while 2DGS, which uses 2D surfels, improves geometric accuracy at the expense of rendering fidelity. To leverage the benefits of both 2DGS and 3DGS, we propose MixedGaussianAvatar, a novel method for realistic and geometrically accurate head avatar reconstruction. Our main idea is to use 2D Gaussians to reconstruct the surface of the 3D head, ensuring geometric accuracy. We attach the 2D Gaussians to the triangular mesh of the FLAME model and connect additional 3D Gaussians to those 2D Gaussians where the rendering quality of 2DGS is inadequate, creating a mixed 2D-3D Gaussian representation. These 2D-3D Gaussians can then be animated using FLAME parameters. We further introduce a progressive training strategy that first trains the 2D Gaussians and then fine-tunes the mixed 2D-3D Gaussians. The unified mixed Gaussian representation integrates the two modalities of 2D images and 3D meshes. Comprehensive experiments demonstrate the superiority of MixedGaussianAvatar. The code will be released.
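
To make the idea of attaching Gaussians to FLAME triangles more concrete, here is a minimal NumPy sketch of one possible local-to-global transformation. The abstract does not give the exact formulation, so everything below is an assumption for illustration: the frame construction (centroid as origin, first edge and face normal as axes, first edge length as scale) and the names `triangle_frame` and `local_to_global` are hypothetical, not the authors' code.

```python
import numpy as np

def triangle_frame(v0, v1, v2):
    """Build an assumed local frame (R, t, s) from one mesh triangle.

    R: 3x3 rotation whose columns are the frame axes,
    t: frame origin (triangle centroid),
    s: isotropic scale (length of the first edge, an illustrative choice).
    """
    t = (v0 + v1 + v2) / 3.0
    e1 = v1 - v0
    x = e1 / np.linalg.norm(e1)          # axis along the first edge
    n = np.cross(e1, v2 - v0)
    z = n / np.linalg.norm(n)            # axis along the face normal
    y = np.cross(z, x)                   # completes a right-handed frame
    R = np.stack([x, y, z], axis=1)
    s = np.linalg.norm(e1)
    return R, t, s

def local_to_global(mu_local, R_local, scale_local, frame):
    """Map a Gaussian defined in triangle space into world space."""
    R, t, s = frame
    mu = s * (R @ mu_local) + t          # position follows the triangle
    rot = R @ R_local                    # orientation follows the triangle
    scale = s * scale_local              # size follows the triangle scale
    return mu, rot, scale

# Usage: a 2D Gaussian (two in-plane scales) bound to a single triangle.
v0 = np.array([0.0, 0.0, 0.0])
v1 = np.array([2.0, 0.0, 0.0])
v2 = np.array([0.0, 2.0, 0.0])
mu_local = np.array([0.0, 0.0, 0.1])
R_local = np.eye(3)
scale_local = np.array([0.5, 0.25])

mu_rest, rot_rest, scale_rest = local_to_global(
    mu_local, R_local, scale_local, triangle_frame(v0, v1, v2))

# "Animating" the mesh: move the triangle and re-run the same mapping.
shift = np.array([0.0, 0.0, 1.0])
mu_moved, rot_moved, _ = local_to_global(
    mu_local, R_local, scale_local,
    triangle_frame(v0 + shift, v1 + shift, v2 + shift))
```

In this sketch the local parameters stay fixed while the triangle moves, so a change in the mesh vertices (for example, from new FLAME parameters) carries the attached Gaussian with it. For a 2D Gaussian, the third column of `rot` can be read as the surfel normal.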

Paper Structure

This paper contains 25 sections, 18 equations, 8 figures, 2 tables, 1 algorithm.

Figures (8)

  • Figure 1: MixedGaussianAvatar uses a mixed 2D-3D Gaussian Splatting method to reconstruct a realistically and geometrically accurate 3D head avatar mesh.
  • Figure 2: Pipeline of MixedGaussianAvatar. We propose a differentiable mixed 2D-3D Gaussian Splatting method for reconstructing realistically and geometrically accurate head avatars from multi-view 2D images. This approach uses 2D Gaussians for geometric precision and 3D Gaussians to correct color errors, resulting in high-quality 3D head mesh and realistic mesh textures or RGB images. The method is integrated with the FLAME model, enabling dynamic effects and animation driven by parameters through local-to-global transformation and the progressive training strategy.
  • Figure 3: The tree structure of mixed 2D-3D Gaussians.
  • Figure 4: Qualitative comparison on mesh reconstruction.
  • Figure 5: Qualitative comparison of image rendering.
  • ...and 3 more figures
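
The tree structure named in Figure 3, together with the progressive training strategy from the abstract, can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: the page does not say how "inadequate" rendering quality is measured, so the per-Gaussian `color_error` score, the `threshold`, and the names `Gaussian2D`, `Gaussian3D`, and `attach_3d_children` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Gaussian3D:
    # Position relative to the parent 2D Gaussian (assumed parameterization).
    offset: Tuple[float, float, float]

@dataclass
class Gaussian2D:
    triangle_id: int                 # mesh triangle this surfel is bound to
    color_error: float = 0.0         # hypothetical per-Gaussian error score
    children: List[Gaussian3D] = field(default_factory=list)

def attach_3d_children(surfels, threshold, per_parent=1):
    """Stage-2 step of the sketch: give 3D children to poorly rendered surfels.

    Returns the number of 2D Gaussians that received children.
    """
    grown = 0
    for g in surfels:
        if g.color_error > threshold and not g.children:
            g.children = [Gaussian3D(offset=(0.0, 0.0, 0.0))
                          for _ in range(per_parent)]
            grown += 1
    return grown

# Usage: after a first stage that trains only the 2D Gaussians, pick the
# surfels whose assumed error score is too high and attach 3D Gaussians.
surfels = [
    Gaussian2D(triangle_id=0, color_error=0.01),
    Gaussian2D(triangle_id=1, color_error=0.20),
    Gaussian2D(triangle_id=2, color_error=0.50),
]
num_grown = attach_3d_children(surfels, threshold=0.1, per_parent=2)
```

Keeping each 3D Gaussian as a child of a 2D Gaussian means the child can inherit the parent's mesh binding, which is one way the whole mixed set could remain animatable through the mesh.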