PlenOctrees for Real-time Rendering of Neural Radiance Fields

Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, Angjoo Kanazawa

TL;DR

This work tackles the slow rendering of Neural Radiance Fields by distilling NeRFs into PlenOctrees, an octree-based, view-dependent representation. By training a NeRF variant that outputs spherical-harmonic coefficients (NeRF-SH) and pre-tabulating the results into a PlenOctree, the authors achieve real-time rendering (over 150 FPS for 800×800 images) with quality comparable to or better than NeRF, and enable in-browser visualization. The method further accelerates training by allowing early stopping of NeRF-SH training and subsequent octree fine-tuning, and provides an interactive desktop and WebGL-based browser demonstration. Together, these contributions enable photorealistic, real-time 6-DOF visualization of complex scenes, with broad implications for AR/VR, product visualization, and interactive online experiences.

Abstract

We introduce a method to render Neural Radiance Fields (NeRFs) in real time using PlenOctrees, an octree-based 3D representation which supports view-dependent effects. Our method can render 800x800 images at more than 150 FPS, which is over 3000 times faster than conventional NeRFs. We do so without sacrificing quality while preserving the ability of NeRFs to perform free-viewpoint rendering of scenes with arbitrary geometry and view-dependent effects. Real-time performance is achieved by pre-tabulating the NeRF into a PlenOctree. In order to preserve view-dependent effects such as specularities, we factorize the appearance via closed-form spherical basis functions. Specifically, we show that it is possible to train NeRFs to predict a spherical harmonic representation of radiance, removing the viewing direction as an input to the neural network. Furthermore, we show that PlenOctrees can be directly optimized to further minimize the reconstruction loss, which leads to equal or better quality compared to competing methods. Moreover, this octree optimization step can be used to reduce the training time, as we no longer need to wait for the NeRF training to converge fully. Our real-time neural rendering approach may potentially enable new applications such as 6-DOF industrial and product visualizations, as well as next generation AR/VR systems. PlenOctrees are amenable to in-browser rendering as well; please visit the project page for the interactive online demo, as well as video and code: https://alexyu.net/plenoctrees

Paper Structure

This paper contains 35 sections, 16 equations, 12 figures, 8 tables.

Figures (12)

  • Figure 1: Real-time NeRF with PlenOctrees. Given a set of posed images of a scene, our method creates a 3D volumetric model that can be rendered in real time. We propose PlenOctrees, which are octrees that can capture view-dependent effects such as specularities. Rendering using our approach is orders of magnitude faster than NeRF.
  • Figure 2: Method Overview. We propose a method to quickly render NeRFs by training a modified NeRF model (NeRF-SH) and converting it into a PlenOctree, an octree that captures view-dependent effects. a) The NeRF-SH model uses the same optimization procedure and volume rendering method presented in NeRF (Mildenhall et al., 2020). However, instead of predicting the RGB color $c$ directly, the network predicts spherical harmonic coefficients $\mathbf{k}$. The color $c$ is calculated by summing the weighted spherical harmonic bases evaluated at the corresponding ray direction $(\theta,\phi)$. The spherical harmonics enable the representation to model view-dependent appearance. The values in the orange boxes are used for volume rendering. b) To build a PlenOctree, we densely sample the NeRF-SH model in the volume around the target object and tabulate the density and SH coefficients. We can further optimize the PlenOctree directly with the training images to improve its quality.
  • Figure 3: Sparsity loss and conversion robustness. When trained without the sparsity loss, NeRF can often converge to a solution where unobserved portions or the background are solid. This degrades the spatial resolution of our octree-based representation.
  • Figure 4: NeRF-synthetic qualitative results. Randomly sampled qualitative comparisons on a reimplementation of NeRF and our proposed method. We are unable to find any significant image quality difference between the two methods. Despite this, our method can render these example images more than 3500x faster.
  • Figure 5: Quality vs. speed comparison of various methods. A comparison of methods on the NeRF-synthetic dataset where higher PSNR and higher FPS (top right) are most desirable. We include four variants of the PlenOctree model that tune parts of the conversion process to trade off accuracy for speed. Please see the adjacent ablation table for descriptions of these variants. Note that the time axis is logarithmic.
  • ...and 7 more figures
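
The color computation described in the Figure 2 caption (a weighted sum of spherical harmonic bases evaluated at the ray direction) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes only degrees 0 and 1 (four coefficients per color channel), one common sign convention for the real SH basis, and a sigmoid to map the sum into $[0, 1]$. The function names `sh_basis_deg1` and `sh_to_rgb` are hypothetical.

```python
import numpy as np

# Real spherical-harmonic normalization constants for degrees 0 and 1.
SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))
SH_C1 = 0.4886025119029199   # sqrt(3 / (4 * pi))


def sh_basis_deg1(direction):
    """Evaluate the 4 real SH basis functions (degree <= 1) at a direction.

    Assumption: basis ordered as (Y_0^0, Y_1^-1, Y_1^0, Y_1^1) with no
    Condon-Shortley sign flip; other implementations may differ in sign.
    """
    d = np.asarray(direction, dtype=np.float64)
    x, y, z = d / np.linalg.norm(d)  # SH are defined on the unit sphere
    return np.array([SH_C0, SH_C1 * y, SH_C1 * z, SH_C1 * x])


def sh_to_rgb(coeffs, direction):
    """Turn per-channel SH coefficients k into a view-dependent RGB color.

    coeffs: array of shape (3, 4), one row of coefficients per color channel.
    The sigmoid squashing is an assumption made for this sketch.
    """
    basis = sh_basis_deg1(direction)
    raw = np.asarray(coeffs, dtype=np.float64) @ basis  # weighted sum per channel
    return 1.0 / (1.0 + np.exp(-raw))
```

With only the degree-0 coefficient set, the color is the same from every direction; a nonzero degree-1 coefficient makes the color change as the viewing direction changes. Because the viewing direction enters only through this closed-form basis, the coefficients can be stored per voxel and evaluated at render time without a network query.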