BecomingLit: Relightable Gaussian Avatars with Hybrid Neural Shading

Jonathan Schmidt, Simon Giebenhain, Matthias Niessner

TL;DR

BecomingLit addresses the challenge of producing photorealistic, relightable head avatars from low-cost multi-view light-stage data. It drives 3D Gaussian primitives with FLAME-based geometry and shades them with a hybrid neural approach that couples a neural diffuse BRDF with an analytical Cook-Torrance specular term, enabling all-frequency relighting and animation driven by monocular video. The authors contribute a high-resolution OLAT facial dataset and report substantial improvements over state-of-the-art baselines in relighting and self-reenactment, while maintaining interactive rendering rates on consumer GPUs. This work lowers the capture and computation barriers for realistic facial avatars, supporting broader research and practical applications in virtual reality and related fields.

Abstract

We introduce BecomingLit, a novel method for reconstructing relightable, high-resolution head avatars that can be rendered from novel viewpoints at interactive rates. To this end, we propose a new low-cost light stage capture setup, tailored specifically to capturing faces. Using this setup, we collect a novel dataset consisting of diverse multi-view sequences of numerous subjects under varying illumination conditions and facial expressions. Leveraging this dataset, we introduce a new relightable avatar representation based on 3D Gaussian primitives that we animate with a parametric head model and an expression-dependent dynamics module. We propose a new hybrid neural shading approach that combines a neural diffuse BRDF with an analytical specular term. Our method reconstructs disentangled materials from our dynamic light stage recordings and enables all-frequency relighting of our avatars with both point lights and environment maps. In addition, our avatars can easily be animated and controlled from monocular videos. We validate our approach in extensive experiments on our dataset, where we consistently outperform existing state-of-the-art methods in relighting and reenactment by a significant margin.
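
To make the idea of hybrid shading concrete, the following is a minimal sketch of how an analytical Cook-Torrance specular term could be added to a diffuse color under a single point light. It is not the authors' implementation: the GGX distribution, the Schlick-GGX geometry term with k = alpha / 2, the Schlick Fresnel approximation, and every function and parameter name are assumptions chosen for illustration. The `diffuse_rgb` argument is a hypothetical stand-in for the output of the neural diffuse BRDF, which is not specified in this section.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    norm = math.sqrt(_dot(v, v))
    return tuple(x / norm for x in v)


def cook_torrance_specular(n, v, l, roughness, f0):
    """Scalar Cook-Torrance specular BRDF value.

    Assumed components (not taken from the paper): GGX normal distribution,
    Schlick-GGX geometry term, Schlick Fresnel approximation.
    n, v, l are the surface normal, view direction, and light direction.
    """
    n, v, l = _normalize(n), _normalize(v), _normalize(l)
    n_dot_l = _dot(n, l)
    n_dot_v = _dot(n, v)
    # Light or viewer below the surface: no specular reflection.
    if n_dot_l <= 0.0 or n_dot_v <= 0.0:
        return 0.0
    h = _normalize(tuple(a + b for a, b in zip(v, l)))  # half vector
    n_dot_h = max(_dot(n, h), 0.0)
    v_dot_h = max(_dot(v, h), 0.0)

    alpha = roughness * roughness
    a2 = alpha * alpha
    # GGX normal distribution D
    d = a2 / (math.pi * (n_dot_h * n_dot_h * (a2 - 1.0) + 1.0) ** 2)
    # Schlick-GGX geometry term G (separable form)
    k = alpha / 2.0
    g = (n_dot_l / (n_dot_l * (1.0 - k) + k)) * (n_dot_v / (n_dot_v * (1.0 - k) + k))
    # Schlick Fresnel F
    f = f0 + (1.0 - f0) * (1.0 - v_dot_h) ** 5
    return d * g * f / (4.0 * n_dot_l * n_dot_v)


def shade_point_light(diffuse_rgb, n, v, l, roughness, f0, light_rgb):
    """Combine a diffuse color with the analytical specular term.

    diffuse_rgb is a hypothetical placeholder for a learned diffuse response
    (assumed here to already include cosine and shadowing effects).
    """
    n_dot_l = max(_dot(_normalize(n), _normalize(l)), 0.0)
    spec = cook_torrance_specular(n, v, l, roughness, f0) * n_dot_l
    return tuple(li * (d + spec) for li, d in zip(light_rgb, diffuse_rgb))
```

In this sketch the specular lobe depends only on a roughness and a base reflectance `f0`. Keeping it analytical means sharp highlights follow the light and view directions explicitly, while softer, harder-to-model effects are left to the diffuse component.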

Paper Structure

This paper contains 25 sections, 9 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: BecomingLit: Our approach effectively reconstructs detailed human head avatars that can be animated from videos and relit in real time using our hybrid neural shading approach. Besides our method, we introduce a new high-quality, multi-view OLAT dataset of faces.
  • Figure 2: OLAT Dataset: (a) The custom light-stage rig we used to capture (b) our dataset of high-resolution, high-frame-rate, multi-view recordings of faces under both OLAT and fully-lit conditions.
  • Figure 3: Method Overview: Given estimated FLAME coefficients, we obtain posed 3D Gaussian primitives with our expression-dependent dynamics module $\mathcal{F}_g$. To render photorealistic appearance, we combine the neural diffuse BRDF $\mathcal{F}_d$ with an analytical specular shading term. The parameters for the specular shading are predicted by the view-dependent $\mathcal{F}_v$ network. The avatar is optimized from light stage sequences using a photometric loss term.
  • Figure 4: Relighting and Self-Reenactment: Qualitative comparison on held-out segments and held-out illuminations.
  • Figure 5: Comparison of Intrinsic Decomposition: We compare the recovered albedo (a) and normals (b), as well as the diffuse (c) and specular (d) contributions on a training frame that sum up to the final rendering (e). Note that the reference image (f) is identical in all rows.
  • ...and 3 more figures