BecomingLit: Relightable Gaussian Avatars with Hybrid Neural Shading
Jonathan Schmidt, Simon Giebenhain, Matthias Niessner
TL;DR
BecomingLit addresses the challenge of producing photorealistic, relightable head avatars from low-cost multi-view light-stage data. It combines FLAME-driven geometry built from 3D Gaussian primitives with a hybrid neural shading approach that couples a neural diffuse BRDF with an analytical Cook-Torrance specular term, enabling all-frequency relighting and animation driven by monocular video. The authors contribute a high-resolution OLAT (one-light-at-a-time) facial dataset and report substantial improvements over state-of-the-art baselines in relighting and self-reenactment, while rendering at interactive rates on consumer GPUs. This work lowers the capture and computation barriers for realistic facial avatars, enabling broader research and practical applications in virtual reality and related fields.
Abstract
We introduce BecomingLit, a novel method for reconstructing relightable, high-resolution head avatars that can be rendered from novel viewpoints at interactive rates. To this end, we propose a new low-cost light stage capture setup, tailored specifically to capturing faces. Using this setup, we collect a novel dataset consisting of diverse multi-view sequences of numerous subjects under varying illumination conditions and facial expressions. Leveraging this dataset, we introduce a new relightable avatar representation based on 3D Gaussian primitives that we animate with a parametric head model and an expression-dependent dynamics module. We propose a new hybrid neural shading approach that combines a neural diffuse BRDF with an analytical specular term. Our method reconstructs disentangled materials from our dynamic light stage recordings and enables all-frequency relighting of our avatars with both point lights and environment maps. In addition, our avatars can easily be animated and controlled from monocular videos. We validate our approach in extensive experiments on our dataset, where we consistently outperform existing state-of-the-art methods in relighting and reenactment by a significant margin.
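
To make the hybrid shading idea concrete, the following is a minimal sketch of an analytical Cook-Torrance specular term added to a diffuse value for a single point light. It is not the authors' implementation. The specific choices here (GGX normal distribution, Schlick Fresnel, a Schlick-style Smith visibility term with k = alpha / 2, and alpha = roughness squared) are assumptions taken from common real-time rendering practice, and the function names `cook_torrance_specular` and `shade` are hypothetical. The `diffuse_rgb` argument merely stands in for whatever the neural diffuse BRDF would output; how the paper actually combines the two terms is not specified in the text above.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def cook_torrance_specular(n, v, l, roughness, f0):
    """Scalar Cook-Torrance specular BRDF value D * F * G / (4 (n.l)(n.v)).

    n, v, l: surface normal, view direction, light direction (3-tuples).
    roughness: perceptual roughness in (0, 1]; alpha = roughness^2 (assumed convention).
    f0: Fresnel reflectance at normal incidence (about 0.04 for dielectrics).
    """
    n, v, l = normalize(n), normalize(v), normalize(l)
    n_l = dot(n, l)
    n_v = dot(n, v)
    if n_l <= 0.0 or n_v <= 0.0:
        return 0.0  # light or viewer is below the surface
    h = normalize(tuple(a + b for a, b in zip(v, l)))  # half vector
    n_h = max(dot(n, h), 0.0)
    v_h = max(dot(v, h), 0.0)

    alpha = roughness * roughness
    a2 = alpha * alpha
    # GGX / Trowbridge-Reitz normal distribution
    d = a2 / (math.pi * (n_h * n_h * (a2 - 1.0) + 1.0) ** 2)
    # Schlick approximation of the Fresnel term
    f = f0 + (1.0 - f0) * (1.0 - v_h) ** 5
    # Schlick-style approximation of the Smith shadowing-masking term
    k = alpha / 2.0
    g = (n_l / (n_l * (1.0 - k) + k)) * (n_v / (n_v * (1.0 - k) + k))
    return d * f * g / (4.0 * n_l * n_v)


def shade(diffuse_rgb, n, v, l, roughness, f0, light_rgb):
    """Add the analytical specular lobe to a given diffuse value.

    diffuse_rgb is a placeholder for the output of a learned diffuse model;
    summing it with light * specular * cos(theta) is an illustrative assumption.
    """
    n_l = max(dot(normalize(n), normalize(l)), 0.0)
    spec = cook_torrance_specular(n, v, l, roughness, f0)
    return tuple(d + li * spec * n_l for d, li in zip(diffuse_rgb, light_rgb))
```

For example, `shade((0.2, 0.1, 0.1), (0, 0, 1), (0, 0, 1), (0, 0, 1), 0.5, 0.04, (1, 1, 1))` evaluates a head-on light and viewer. Keeping the specular lobe analytical means only a few interpretable parameters (roughness and base reflectance) have to be recovered, while the diffuse part is left to a learned model.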
