Table of Contents
Fetching ...

Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections

Congrong Xu, Justin Kerr, Angjoo Kanazawa

TL;DR

This paper introduces Splatfacto-W, an approach that integrates per-Gaussian neural color features and per-image appearance embeddings into the rasterization process, along with a spherical harmonics-based background model to represent varying photometric appearances and better depict backgrounds.

Abstract

Novel view synthesis from unconstrained in-the-wild image collections remains a significant yet challenging task due to photometric variations and transient occluders that complicate accurate scene reconstruction. Previous methods have approached these issues by integrating per-image appearance features embeddings in Neural Radiance Fields (NeRFs). Although 3D Gaussian Splatting (3DGS) offers faster training and real-time rendering, adapting it for unconstrained image collections is non-trivial due to the substantially different architecture. In this paper, we introduce Splatfacto-W, an approach that integrates per-Gaussian neural color features and per-image appearance embeddings into the rasterization process, along with a spherical harmonics-based background model to represent varying photometric appearances and better depict backgrounds. Our key contributions include latent appearance modeling, efficient transient object handling, and precise background modeling. Splatfacto-W delivers high-quality, real-time novel view synthesis with improved scene consistency in in-the-wild scenarios. Our method improves the Peak Signal-to-Noise Ratio (PSNR) by an average of 5.3 dB compared to 3DGS, enhances training speed by 150 times compared to NeRF-based methods, and achieves a similar rendering speed to 3DGS. Additional video results and code integrated into Nerfstudio are available at https://kevinxu02.github.io/splatfactow/.

Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections

TL;DR

This paper introduces Splatfacto-W, an approach that integrates per-Gaussian neural color features and per-image appearance embeddings into the rasterization process, along with a spherical harmonics-based background model to represent varying photometric appearances and better depict backgrounds.

Abstract

Novel view synthesis from unconstrained in-the-wild image collections remains a significant yet challenging task due to photometric variations and transient occluders that complicate accurate scene reconstruction. Previous methods have approached these issues by integrating per-image appearance features embeddings in Neural Radiance Fields (NeRFs). Although 3D Gaussian Splatting (3DGS) offers faster training and real-time rendering, adapting it for unconstrained image collections is non-trivial due to the substantially different architecture. In this paper, we introduce Splatfacto-W, an approach that integrates per-Gaussian neural color features and per-image appearance embeddings into the rasterization process, along with a spherical harmonics-based background model to represent varying photometric appearances and better depict backgrounds. Our key contributions include latent appearance modeling, efficient transient object handling, and precise background modeling. Splatfacto-W delivers high-quality, real-time novel view synthesis with improved scene consistency in in-the-wild scenarios. Our method improves the Peak Signal-to-Noise Ratio (PSNR) by an average of 5.3 dB compared to 3DGS, enhances training speed by 150 times compared to NeRF-based methods, and achieves a similar rendering speed to 3DGS. Additional video results and code integrated into Nerfstudio are available at https://kevinxu02.github.io/splatfactow/.
Paper Structure (15 sections, 18 equations, 4 figures, 1 table)

This paper contains 15 sections, 18 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Splatfacto-W: Real-time exploration of in-the-wild images. Our approach enables real-time appearance change in the nerfstudio viewer. For example, one can click each input image, and explore the scene using that appearance conditioning, and seamlessly move on to another by clicking an input view. Please see the video on our https://kevinxu02.github.io/splatfactow/.
  • Figure 2: We begin by predicting the color of each Gaussian using the Appearance Model. These Gaussians are then rasterized to generate the foreground objects. While Background Model predicts the background given ray directions. The foreground and background are merged using alpha blending to produce the final image. This final image is compared with the masked ground truth image and then processed through the Robust Mask to update the model parameters.
  • Figure 3: Eval Results for Trevi Fountain, Brandenburg Gate, and Sacre Coeur (Left: Ground Truth; Right: Splatfacto-W)
  • Figure 4: Background Modeling in Splatfacto (Left: Without Background Model; Right: With Background Model)