Latent Intrinsics Emerge from Training to Relight
Xiao Zhang, William Gao, Seemandhar Jain, Michael Maire, David A. Forsyth, Anand Bhattad
TL;DR
The paper tackles image relighting by learning latent intrinsic and extrinsic representations directly from data, avoiding explicit physical models. It proposes a fully data-driven autoencoder that encodes intrinsic scene properties $S^l_{s,i}$ and lighting $L^l_s$ from paired images and decodes to relit images, with a constrained-scaling fusion to prevent leakage. The key findings are that albedo-like maps emerge from latent intrinsics without supervision, relighting achieves state-of-the-art results on real scenes, and the model generalizes to zero-shot relighting and to StyleGAN-generated images. This approach reduces reliance on detailed geometry and surface models and offers a flexible, scalable pathway for relighting and intrinsic estimation in diverse scenes.
Abstract
Image relighting is the task of showing what a scene from a source image would look like if illuminated differently. Inverse graphics schemes recover an explicit representation of geometry and a set of chosen intrinsics, then relight with some form of renderer. However, error control for inverse graphics is difficult, and inverse graphics methods can represent only the effects of the chosen intrinsics. This paper describes a relighting method that is entirely data-driven, where intrinsics and lighting are each represented as latent variables. Our approach produces SOTA relightings of real scenes, as measured by standard metrics. We show that albedo can be recovered from our latent intrinsics without using any example albedos, and that the albedos recovered are competitive with SOTA methods.
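To make the latent-swap idea concrete, the following is a minimal toy sketch in Python, not the paper's implementation. It assumes random linear maps in place of trained networks, and the names `encode`, `fuse`, `decode`, `relight`, the dimensions, and the `alpha` bound are all hypothetical. In particular, the bounded gain in `fuse` is only one plausible reading of a "constrained scaling" fusion; the actual formulation may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: flattened image, intrinsic code, extrinsic (lighting) code.
D_IMG, D_INT, D_EXT = 48, 16, 4

# Assumption: random linear maps stand in for trained encoder/decoder networks.
W_int = rng.normal(size=(D_INT, D_IMG)) / np.sqrt(D_IMG)
W_ext = rng.normal(size=(D_EXT, D_IMG)) / np.sqrt(D_IMG)
W_mod = rng.normal(size=(D_INT, D_EXT)) / np.sqrt(D_EXT)
W_dec = rng.normal(size=(D_IMG, D_INT)) / np.sqrt(D_INT)


def encode(image):
    """Return (intrinsic, extrinsic) latent codes for a flattened image."""
    return W_int @ image, W_ext @ image


def fuse(intrinsic, extrinsic, alpha=0.1):
    """Scale each intrinsic channel by a gain bounded in [1 - alpha, 1 + alpha].

    Assumption: bounding the gain limits how much the lighting code can
    change the intrinsic features, which is the illustrative stand-in here
    for a constrained-scaling fusion.
    """
    gain = 1.0 + alpha * np.tanh(W_mod @ extrinsic)
    return intrinsic * gain


def decode(fused):
    """Map fused features back to image space."""
    return W_dec @ fused


def relight(source, reference):
    """Keep the source's intrinsic code, borrow the reference's lighting code."""
    intrinsic_src, _ = encode(source)
    _, extrinsic_ref = encode(reference)
    return decode(fuse(intrinsic_src, extrinsic_ref))


source = rng.random(D_IMG)
reference = rng.random(D_IMG)
relit = relight(source, reference)
```

In this sketch, relighting amounts to pairing the intrinsic code of the source with the extrinsic code of a reference image taken under the desired lighting. Passing the same image as both source and reference reduces `relight` to plain autoencoding.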
