RadSplat: Radiance Field-Informed Gaussian Splatting for Robust Real-Time Rendering with 900+ FPS

Michael Niemeyer, Fabian Manhardt, Marie-Julie Rakotosaona, Michael Oechsle, Daniel Duckworth, Rama Gosula, Keisuke Tateno, John Bates, Dominik Kaeser, Federico Tombari

TL;DR

The paper addresses real-time, high-quality view synthesis for complex real-world scenes, where volumetric NeRF methods are too slow to render and Gaussian Splatting relies on brittle optimization heuristics. RadSplat combines a radiance-field prior (Zip-NeRF with GLO embeddings) with a point-based Gaussian Splatting representation, and adds a pruning strategy and test-time visibility filtering to scale to house-sized scenes. Key contributions include (i) radiance-field-based initialization and supervision, (ii) ray-contribution-based pruning that reduces Gaussian counts by up to 10x, and (iii) viewpoint clustering and visibility-based filtering for faster rendering. On the Mip-NeRF360 and Zip-NeRF benchmarks, RadSplat achieves state-of-the-art synthesis quality and renders at up to 900+ FPS, surpassing prior real-time and non-real-time methods in both speed and fidelity.
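
The ray-contribution-based pruning in (ii) can be pictured with a small sketch. It assumes that a Gaussian's importance is the largest blending weight (opacity times accumulated transmittance) it receives on any training ray, and that Gaussians scoring below a threshold are dropped. The function names, the ray encoding, and the threshold of 0.01 are illustrative assumptions, not values taken from this summary.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical encoding: one ray is a front-to-back list of (gaussian_id, alpha).
Ray = Sequence[Tuple[int, float]]


def ray_contributions(ray: Ray) -> Dict[int, float]:
    """Blending weight (alpha * transmittance) of each Gaussian along one ray."""
    weights: Dict[int, float] = {}
    transmittance = 1.0
    for gid, alpha in ray:
        weight = alpha * transmittance
        weights[gid] = max(weights.get(gid, 0.0), weight)
        transmittance *= 1.0 - alpha  # light remaining behind this Gaussian
    return weights


def importance_scores(rays: Sequence[Ray], num_gaussians: int) -> List[float]:
    """Per-Gaussian score: maximum contribution over all training rays."""
    scores = [0.0] * num_gaussians
    for ray in rays:
        for gid, weight in ray_contributions(ray).items():
            scores[gid] = max(scores[gid], weight)
    return scores


def keep_mask(scores: Sequence[float], threshold: float) -> List[bool]:
    """True for Gaussians that survive pruning (assumed rule: score >= threshold)."""
    return [score >= threshold for score in scores]


# Toy example: Gaussian 2 only ever contributes faintly, Gaussian 3 is never hit.
rays = [
    [(0, 0.9), (1, 0.5)],
    [(2, 0.005), (1, 0.02)],
]
scores = importance_scores(rays, num_gaussians=4)
mask = keep_mask(scores, threshold=0.01)
```

Taking the maximum rather than the mean keeps a Gaussian that matters for even a single ray, which is one reason such a rule could remove many points without visibly hurting quality.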

Abstract

Recent advances in view synthesis and real-time rendering have achieved photorealistic quality at impressive rendering speeds. While Radiance Field-based methods achieve state-of-the-art quality in challenging scenarios such as in-the-wild captures and large-scale scenes, they often suffer from excessively high compute requirements linked to volumetric rendering. Gaussian Splatting-based methods, on the other hand, rely on rasterization and naturally achieve real-time rendering but suffer from brittle optimization heuristics that underperform on more challenging scenes. In this work, we present RadSplat, a lightweight method for robust real-time rendering of complex scenes. Our main contributions are threefold. First, we use radiance fields as a prior and supervision signal for optimizing point-based scene representations, leading to improved quality and more robust optimization. Next, we develop a novel pruning technique that reduces the overall point count while maintaining high quality, leading to smaller and more compact scene representations with faster inference speeds. Finally, we propose a novel test-time filtering approach that further accelerates rendering and allows scaling to larger, house-sized scenes. We find that our method enables state-of-the-art synthesis of complex captures at 900+ FPS.
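
The test-time filtering idea can be sketched as follows, under stated assumptions: training camera positions are grouped with a plain k-means, each cluster stores the union of Gaussians seen by its cameras, and a test view renders only the set belonging to its nearest cluster. All names here are hypothetical, and the per-camera visibility lists are assumed to come from the rasterizer (for example, Gaussians whose contribution exceeds some threshold).

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float, float]


def nearest(centers: Sequence[Point], p: Point) -> int:
    """Index of the cluster center closest to p."""
    return min(range(len(centers)), key=lambda c: math.dist(centers[c], p))


def kmeans(points: Sequence[Point], k: int, iters: int = 10) -> List[Point]:
    """Plain Lloyd iterations; the first k points serve as a deterministic init."""
    centers = list(points[:k])
    for _ in range(iters):
        groups: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def build_visibility(
    cam_positions: Sequence[Point],
    visible_per_cam: Sequence[Set[int]],
    centers: Sequence[Point],
) -> List[Set[int]]:
    """Union of visible Gaussian ids over the cameras assigned to each cluster."""
    visible: List[Set[int]] = [set() for _ in centers]
    for pos, ids in zip(cam_positions, visible_per_cam):
        visible[nearest(centers, pos)] |= set(ids)
    return visible


def gaussians_to_render(
    view_pos: Point, centers: Sequence[Point], visible: Sequence[Set[int]]
) -> Set[int]:
    """At test time, render only the Gaussians of the nearest camera cluster."""
    return visible[nearest(centers, view_pos)]


# Toy example: two groups of cameras, e.g. two rooms of a house.
cams = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.5, 0.0, 0.0), (10.5, 0.0, 0.0)]
seen = [{0, 1}, {2}, {1, 3}, {4}]
centers = kmeans(cams, k=2)
visible = build_visibility(cams, seen, centers)
subset = gaussians_to_render((9.0, 0.0, 0.0), centers, visible)
```

Because the per-cluster sets are computed once after training, the only per-frame cost in this sketch is a nearest-center lookup, which is why such filtering can pay off most in large scenes where any one view sees a small fraction of the Gaussians.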

Paper Structure (12 sections, 12 equations, 7 figures, 2 tables)

Figures (7)

  • Figure 1: RadSplat. By combining benefits of neural fields and point-based representations, we achieve state-of-the-art quality in view synthesis on mip-NeRF 360 [barron2022mip] while rendering at 900+ frames per second (FPS), indicating a speed-up of 3.6$\times$ over 3D Gaussian Splatting (3DGS) [kerbl20233d] and 3,000$\times$ over Zip-NeRF [Barron2023zipnerf].
  • Figure 2: Robust View Synthesis. On complex scenes with lighting variations, 3D Gaussian Splatting (3DGS) [kerbl20233d] degrades (a). When equipped with exposure handling modules [kerbl20233d, duckworth2023smerf], results improve but they still contain artifacts and are overly smooth (b). In contrast, we achieve high quality even for challenging, large-scale captures (c) by integrating a robust radiance field as prior.
  • Figure 3: Overview. 1. Given posed input images of a scene, we train a robust neural radiance field with GLO embeddings $\mathbf{l}_i$. 2. We use the radiance field prior to initialize and supervise our point-based 3DGS representation that we optimize with a novel pruning technique for more compact, high-quality scenes. 3. We perform viewpoint-based visibility filtering to further accelerate test-time rendering speed.
  • Figure 4: Qualitative Comparison. We show results on Bicycle and Kitchen from [barron2022mip] and on Berlin, NYC, London from [Barron2023zipnerf]. Compared to Zip-NeRF, our method better captures high-frequency textures (e.g., tablecloth in Kitchen) and geometric details (e.g., bicycle spokes in Berlin). Compared to 3DGS, we obtain sharper (e.g., shiny surfaces in London) and more stable results (e.g., color shift in Kitchen).
  • Figure 5: Ablation Study. Without (w/o) the NeRF initialization, geometric and texture details might get lost (a). Without the NeRF supervision, floating artifacts appear if the views exhibit lighting or exposure changes (b). W/o pruning, the number of Gaussians is 1.5$\times$ larger without any quality improvements (c).
  • ...and 2 more figures
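
The speed-up factors quoted in the Figure 1 caption imply rough baseline frame rates. Treating 900 FPS as a lower bound for RadSplat, a back-of-the-envelope estimate (not reported measurements) is:

$$
\text{FPS}_{\text{3DGS}} \approx \frac{900}{3.6} = 250, \qquad
\text{FPS}_{\text{Zip-NeRF}} \approx \frac{900}{3000} = 0.3
$$

Under that assumption, 3DGS would render at roughly 250 FPS and Zip-NeRF would need a little over 3 seconds per frame.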