Gaussian Deja-vu: Creating Controllable 3D Gaussian Head-Avatars with Enhanced Generalization and Personalization Abilities

Peizhi Yan; Rabab Ward; Qiang Tang; Shan Du

TL;DR

The “Gaussian Déjà-vu” framework first obtains a generalized model of the head avatar and then personalizes the result. It outperforms state-of-the-art 3D Gaussian head avatars in photorealistic quality and reduces training time to a quarter or less of that of existing methods.

Abstract

Recent advancements in 3D Gaussian Splatting (3DGS) have unlocked significant potential for modeling 3D head avatars, providing greater flexibility than mesh-based methods and more efficient rendering than NeRF-based approaches. Despite these advancements, creating controllable 3DGS-based head avatars remains time-intensive, often requiring tens of minutes to hours. To expedite this process, we introduce the "Gaussian Deja-vu" framework, which first obtains a generalized model of the head avatar and then personalizes the result. The generalized model is trained on large 2D (synthetic and real) image datasets and provides a well-initialized 3D Gaussian head, which is further refined using a monocular video to obtain the personalized head avatar. For personalization, we propose learnable expression-aware rectification blendmaps that correct the initial 3D Gaussians, ensuring rapid convergence without relying on neural networks. Experiments show that the proposed method outperforms state-of-the-art 3D Gaussian head avatars in photorealistic quality and reduces training time to a quarter or less of that of existing methods, producing the avatar in minutes.
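
As a rough illustration of how rectification blendmaps could correct an initial set of Gaussian attributes without a neural network, the sketch below adds a weighted sum of correction maps to a flattened attribute map. The abstract does not say how the blendmaps are parameterized or combined, so the linear blending, the flat list layout, and the names `rectify`, `blendmaps`, and `weights` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List


def rectify(initial: List[float],
            blendmaps: List[List[float]],
            weights: List[float]) -> List[float]:
    """Return initial + sum_k weights[k] * blendmaps[k], element by element.

    initial:   flattened Gaussian attributes (hypothetical layout)
    blendmaps: one correction map per blending weight (hypothetical)
    weights:   expression-dependent coefficients (hypothetical)
    """
    if len(blendmaps) != len(weights):
        raise ValueError("need exactly one weight per blendmap")
    rectified = list(initial)  # copy so the initial map is left untouched
    for weight, blendmap in zip(weights, blendmaps):
        if len(blendmap) != len(initial):
            raise ValueError("blendmap size must match the initial map")
        for i, delta in enumerate(blendmap):
            rectified[i] += weight * delta
    return rectified


# Toy example: three attribute values and two correction maps.
initial = [0.0, 1.0, 2.0]
blendmaps = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
weights = [0.5, 0.25]
rectified = rectify(initial, blendmaps, weights)
```

In this assumed form, only the entries of `blendmaps` would be learned from the monocular video; with all weights at zero the function returns the initial map unchanged.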

Paper Structure (17 sections, 11 equations, 9 figures, 4 tables)

Introduction
Related Works
3D Morphable Models in 3D Face Modeling
NeRF-Based 3D Face and Head Models
3D Gaussian-Based Head Avatars
Method
Background: 3D Gaussian Splatting
Our UV Gaussian Map Representation
Single-Image-Based Reconstruction
Monocular-Video-Based Optimization
Experiments
Datasets
Implementation Details
Single-Image-Based Reconstruction Results
Monocular-Video-Based Optimization Results
...and 2 more sections

Figures (9)

Figure 1: Gaussian Déjà-vu first trains a reconstruction model on large face image datasets, which serves as a generalized base. This model initializes the 3D Gaussian head, which is then optimized to personalize the avatar to match the person in the video.
Figure 2: Detailed flowcharts for our single-image-based reconstruction (a) and monocular-video-based further optimization (b) processes.
Figure 3: Qualitative comparison results on single-image-based reconstruction.
Figure 4: Qualitative comparison with HeadNeRF across varying viewing angles. Our method works even at extreme angles.
Figure 5: Expression reconstruction results.
...and 4 more figures
