Bringing NeRFs to the Latent Space: Inverse Graphics Autoencoder
Antoine Schnepf, Karim Kassab, Jean-Yves Franceschi, Laurent Caraffa, Flavian Vasile, Jeremie Mary, Andrew Comport, Valerie Gouet-Brunet
TL;DR
This work enables NeRFs to operate directly in the latent space of an image autoencoder by introducing IG-AE, an Inverse Graphics Autoencoder whose latent space is made 3D-aware by regularizing it with synthetic 3D geometry. IG-AE preserves reconstruction quality on both synthetic and real data, and it is paired with a two-stage latent NeRF training pipeline (Latent Supervision and RGB Alignment). The approach uses Tri-Planes to model 3D scenes, enforcing 3D consistency in the latent space while aligning decoded renderings with RGB views. This yields higher-quality latent NeRFs than a standard autoencoder and faster training and rendering than RGB-space NeRFs. An open-source Nerfstudio extension lets researchers train various NeRF models in the latent space and keeps them interoperable with other latent-based 2D methods.
Abstract
While pre-trained image autoencoders are increasingly utilized in computer vision, the application of inverse graphics in 2D latent spaces has been under-explored. Yet, besides reducing the training and rendering complexity, applying inverse graphics in the latent space enables a valuable interoperability with other latent-based 2D methods. The major challenge is that inverse graphics cannot be directly applied to such image latent spaces because they lack an underlying 3D geometry. In this paper, we propose an Inverse Graphics Autoencoder (IG-AE) that specifically addresses this issue. To this end, we regularize an image autoencoder with 3D geometry by aligning its latent space with jointly trained latent 3D scenes. We utilize the trained IG-AE to bring NeRFs to the latent space with a latent NeRF training pipeline, which we implement in an open-source extension of the Nerfstudio framework, thereby unlocking latent scene learning for its supported methods. We experimentally confirm that Latent NeRFs trained with IG-AE present an improved quality compared to a standard autoencoder, all while exhibiting training and rendering accelerations with respect to NeRFs trained in the image space. Our project page can be found at https://ig-ae.github.io.
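To make the two stage names from the TL;DR concrete, the sketch below evaluates one plausible form of each objective on toy data. It is an illustration under stated assumptions, not the authors' implementation: the linear `encode` and `decode` functions, the downsampling factor `DOWN = 8`, the channel counts, and the mean-squared-error losses are all hypothetical stand-ins, and the "rendered latent" is simply an array rather than the output of a NeRF.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 32x32 RGB view and a latent that is 8x smaller per side.
H, W, C_RGB, C_LAT, DOWN = 32, 32, 3, 4, 8

# Hypothetical linear channel-mixing weights standing in for a trained autoencoder.
ENC_W = rng.normal(size=(C_RGB, C_LAT))
DEC_W = rng.normal(size=(C_LAT, C_RGB))


def encode(image):
    """Toy encoder: average-pool by DOWN, then mix RGB channels into latent channels."""
    h, w = image.shape[0] // DOWN, image.shape[1] // DOWN
    pooled = image.reshape(h, DOWN, w, DOWN, C_RGB).mean(axis=(1, 3))
    return pooled @ ENC_W


def decode(latent):
    """Toy decoder: mix latent channels into RGB, then upsample by repetition."""
    rgb = latent @ DEC_W
    return rgb.repeat(DOWN, axis=0).repeat(DOWN, axis=1)


def latent_supervision_loss(rendered_latent, image):
    """Assumed stage-1 objective: the rendered latent should match the encoded view."""
    return float(np.mean((rendered_latent - encode(image)) ** 2))


def rgb_alignment_loss(rendered_latent, image):
    """Assumed stage-2 objective: the decoded rendering should match the RGB view."""
    return float(np.mean((decode(rendered_latent) - image) ** 2))


image = rng.uniform(size=(H, W, C_RGB))   # a fake ground-truth RGB view
rendered_latent = encode(image)           # a latent render that is perfect in latent space

stage1 = latent_supervision_loss(rendered_latent, image)
stage2 = rgb_alignment_loss(rendered_latent, image)

# Pixels (rays) per RGB view divided by pixels (rays) per latent view.
ray_ratio = (H * W) // (rendered_latent.shape[0] * rendered_latent.shape[1])
```

In this toy setup the latent render matches the encoded view exactly, so the first loss is zero, while the second generally is not because the toy decoder does not invert the toy encoder. That gap is the kind of mismatch an RGB alignment step would aim to reduce. The `ray_ratio` value also hints at where the reported speedups could come from: with the assumed factor of 8 per side, a latent view needs far fewer rendered pixels than the corresponding RGB view.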
