Using Gaussian Splats to Create High-Fidelity Facial Geometry and Texture

Haodi He; Jihun Yu; Ronald Fedkiw

TL;DR

The paper presents a pipeline that reconstructs high-fidelity facial geometry and texture from uncalibrated multi-view images using Gaussian Splatting. Gaussians are softly constrained to an underlying triangulated mesh, and semantic segmentation is used to align the facial regions. With accurate geometry in place, the Gaussians are transformed into texture space and treated as a view-dependent neural texture, while a relightable Gaussian model with PCA-albedo priors separates texture from lighting, yielding de-lit textures from limited data without a light stage. The method supports training on disparate captures and culminates in MetaHuman generation, producing an animatable, relightable asset compatible with standard graphics pipelines. Comparisons with prior work show improved geometry alignment and robust de-lighting, and the approach is demonstrated in a text-driven asset creation pipeline.

Abstract

We leverage increasingly popular three-dimensional neural representations in order to construct a unified and consistent explanation of a collection of uncalibrated images of the human face. Our approach utilizes Gaussian Splatting, since it is more explicit and thus more amenable to constraints than NeRFs. We leverage segmentation annotations to align the semantic regions of the face, facilitating the reconstruction of a neutral pose from only 11 images (as opposed to requiring a long video). We soft constrain the Gaussians to an underlying triangulated surface in order to provide a more structured Gaussian Splat reconstruction, which in turn informs subsequent perturbations to increase the accuracy of the underlying triangulated surface. The resulting triangulated surface can then be used in a standard graphics pipeline. In addition, and perhaps most impactful, we show how accurate geometry enables the Gaussian Splats to be transformed into texture space where they can be treated as a view-dependent neural texture. This allows one to use high visual fidelity Gaussian Splatting on any asset in a scene without the need to modify any other asset or any other aspect (geometry, lighting, renderer, etc.) of the graphics pipeline. We utilize a relightable Gaussian model to disentangle texture from lighting in order to obtain a delit high-resolution albedo texture that is also readily usable in a standard graphics pipeline. The flexibility of our system allows for training with disparate images, even with incompatible lighting, facilitating robust regularization. Finally, we demonstrate the efficacy of our approach by illustrating its use in a text-driven asset creation pipeline.
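As an illustration of what softly constraining Gaussians to a triangulated surface could look like, the following minimal sketch penalizes the squared distance of each Gaussian center from the plane of an assigned parent triangle. The abstract does not specify the actual loss, so the point-to-plane form, the mean reduction, the `weight` parameter, and all function names here are assumptions made for illustration only.

```python
from math import sqrt

# Small 3-vector helpers on plain tuples (no external dependencies).
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def signed_plane_offset(point, triangle):
    """Signed distance from `point` to the plane containing `triangle`."""
    v0, v1, v2 = triangle
    normal = cross(sub(v1, v0), sub(v2, v0))
    length = sqrt(dot(normal, normal))
    if length == 0.0:
        raise ValueError("degenerate triangle")
    return dot(sub(point, v0), normal) / length

def soft_surface_penalty(centers, triangles, parent, weight=1.0):
    """Hypothetical soft constraint: weighted mean squared plane offset.

    centers   : list of Gaussian centers (3-tuples)
    triangles : list of triangles, each a tuple of three vertices
    parent    : parent[i] is the index of the triangle assigned to centers[i]
    weight    : assumed scalar trading this term off against an image loss
    """
    total = 0.0
    for center, tri_index in zip(centers, parent):
        offset = signed_plane_offset(center, triangles[tri_index])
        total += offset * offset
    return weight * total / len(centers)
```

Because the term is a penalty rather than a hard projection, Gaussians remain free to move off the surface when the image evidence is strong enough to outweigh it, which is the general sense in which such a constraint is "soft".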
