From NeRFs to Gaussian Splats, and Back

Siming He, Zach Osman, Pratik Chaudhari

TL;DR

The paper addresses the trade-off between generalization across viewpoints and rendering speed for scene representations in robotics, where only sparse ego-centric views are available. It introduces NeRFGS, a bidirectional conversion between the implicit NeRF-SH representation and explicit Gaussian splatting (GS), which allows compact storage and easy edits with minimal retraining. NeRFGS renders in real time (>40 FPS) while retaining NeRF-like quality on novel views after modest fine-tuning (e.g., around 100 iterations), and the conversion takes about 10 seconds on an RTX 4090. Experiments on Aspen, Giannini Hall, Wissahickon, and Locust Walk show that NeRFGS often surpasses the NeRF-SH and GS baselines in PSNR, SSIM, and LPIPS on dissimilar views, and that it supports editing workflows such as removing a lamp post by editing the GS representation.

Abstract

For robotics applications where there is a limited number of (typically ego-centric) views, parametric representations such as neural radiance fields (NeRFs) generalize better than non-parametric ones such as Gaussian splatting (GS) to views that are very different from those in the training data; GS, however, can render much faster than NeRFs. We develop a procedure to convert back and forth between the two. Our approach achieves the best of both NeRFs (superior PSNR, SSIM, and LPIPS on dissimilar views, and a compact representation) and GS (real-time rendering and the ability to easily modify the representation); the computational cost of these conversions is minor compared to training the two from scratch.

Paper Structure (3 sections, 3 figures, 2 tables)

Figures (3)

  • Figure 1: NeRF generalizes better than Gaussian splatting (GS) to views that are very different from those in the training data. NeRF-SH (Left) and Splatfacto (Right), both summarized in \ref{tab:models}, are trained on Aspen (Top) and Giannini Hall (Bottom) from the Nerfstudio dataset \cite{nerfstudio}. On both datasets, NeRF-SH and Splatfacto achieve comparable training and validation PSNRs because the validation views are similar to the training views (see \ref{fig:newdataset}, Top). For novel views that differ more from the training views, such as those shown in this figure, NeRF-SH renders better RGB and depth images than Splatfacto. The red boxes in the Aspen novel view mark areas in the RGB and depth views where NeRF-SH has noticeably better depth geometry and fewer artifacts than Splatfacto. For the Giannini Hall novel view, NeRF-SH clearly preserves the depth structure better than Splatfacto.
  • Figure 3: NeRFGS generalizes better than GS while rendering in real time. NeRFGS converts a trained NeRF-SH into GS while maintaining good generalization, in contrast to Splatfacto in \ref{fig:result_new_1}. The conversion takes about 10 sec on a GeForce RTX 4090: 7 sec for extracting spherical harmonics and 3 sec for fine-tuning. It is therefore fast enough to be done periodically on a robot. If necessary, this time can be reduced by converting NeRF-SH to GS only around the robot, or by ignoring the sky (which is not relevant for many ground robotics tasks).
  • Figure 5: NeRFs can be efficiently converted to high-quality Gaussian splats. We report the PSNR, SSIM, and LPIPS on validation data as a function of training progress for Aspen. After 1000 iterations of fine-tuning, NeRFGS performs comparably to or better than NeRF-SH and Splatfacto.
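
The captions above compare models by PSNR, among other metrics. As a reminder of what that number measures, here is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), in plain Python. The function name `psnr`, the flat-list image format, and the default peak value of 1.0 are assumptions made for illustration; this is not the paper's evaluation code, which may differ in details such as color space or masking.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], rendered: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between two flattened images.

    Both inputs are assumed to hold pixel intensities in [0, max_value].
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((r - x) ** 2 for r, x in zip(reference, rendered)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [0.0, 1.0]
    render = [0.1, 0.9]
    print(f"PSNR: {psnr(ground_truth, render):.2f} dB")
```

Higher PSNR means the rendered view is closer to the ground-truth image. SSIM and LPIPS capture structural and perceptual similarity, respectively, and are not covered by this sketch.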