DeRF: Decomposed Radiance Fields

Daniel Rebain, Wei Jiang, Soroosh Yazdani, Ke Li, Kwang Moo Yi, Andrea Tagliasacchi

TL;DR

DeRF tackles slow NeRF rendering by decomposing a scene into spatially localized neural heads, with regions defined by a differentiable Voronoi partition and per-region outputs composited with the Painter's Algorithm. Because simply enlarging a single network yields diminishing returns, assigning a smaller network to each region uses capacity more effectively while keeping memory access GPU-friendly. Empirically, DeRF achieves up to 3x more efficient inference at the same rendering quality, or up to 1.0 dB higher PSNR at the same inference cost, with qualitative gains in detail. The approach is supported by a training regime that stabilizes the learned decomposition and by a proof that Voronoi partitions are compatible with the Painter's Algorithm.

Abstract

With the advent of Neural Radiance Fields (NeRF), neural networks can now render novel views of a 3D scene with quality that fools the human eye. Yet, generating these images is very computationally intensive, limiting the applicability of such methods in practical scenarios. In this paper, we propose a technique based on spatial decomposition capable of mitigating this issue. Our key observation is that there are diminishing returns in employing larger (deeper and/or wider) networks. Hence, we propose to spatially decompose a scene and dedicate a smaller network to each decomposed part. When working together, these networks can render the whole scene. This allows near-constant inference time regardless of the number of decomposed parts. Moreover, we show that a Voronoi spatial decomposition is preferable for this purpose, as it is provably compatible with the Painter's Algorithm for efficient and GPU-friendly rendering. Our experiments show that for real-world scenes, our method provides up to 3x more efficient inference than NeRF (with the same rendering quality), or an improvement of up to 1.0 dB in PSNR (for the same inference cost).
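
To make the Painter's Algorithm compatibility claim concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hard Voronoi partition given by a hypothetical list of cell sites, whereas the paper learns a differentiable decomposition. All function names are illustrative. The sketch relies on a standard geometric property of Voronoi diagrams: sorting cells by the distance from the camera to their sites yields a valid visibility order.

```python
import math

def nearest_site(point, sites):
    """Index of the Voronoi cell containing `point`, i.e. the nearest site."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

def painter_order(camera, sites):
    """Cell indices sorted back to front as seen from `camera`.

    Cells whose sites are farther from the camera are painted first.
    """
    return sorted(range(len(sites)),
                  key=lambda i: math.dist(camera, sites[i]),
                  reverse=True)

def cells_along_ray(origin, direction, sites, t_max=10.0, steps=1000):
    """Cells visited by a ray, front to back, with consecutive repeats merged.

    Uses dense point sampling, which is only an approximation of exact
    ray/cell intersection but is enough to illustrate the ordering.
    """
    visited = []
    for k in range(steps + 1):
        t = t_max * k / steps
        p = tuple(o + t * d for o, d in zip(origin, direction))
        cell = nearest_site(p, sites)
        if not visited or visited[-1] != cell:
            visited.append(cell)
    return visited

# Hypothetical sites and camera, chosen only for illustration.
sites = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 1.0)]
camera = (-1.0, -0.5, 0.3)
visited = cells_along_ray(camera, (1.0, 0.6, 0.0), sites)
front_to_back = painter_order(camera, sites)[::-1]
```

Under these assumptions, every ray leaving the camera visits cells in an order consistent with `front_to_back`, and no cell is entered twice because Voronoi cells are convex. This is why a single per-view sort of the cells suffices, rather than a per-ray sort.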

Paper Structure

This paper contains 34 sections, 9 equations, 18 figures, 4 tables.

Figures (18)

  • Figure 1: We render a scene (a) from a decomposed neural representation (b), consisting of a collection of spatially localized neural networks. Each of these networks renders a convex portion of the image, and these are then composited into the output via the Painter's Algorithm. Depending on the level of decomposition, this can lead to faster rendering, or to renderings that have the same runtime, but contain sharper details. In (c), we decompose the scene into 16 parts, which leads to sharper details than (d), with similar runtime.
  • Figure 2: Diminishing returns -- We sweep through network architectures varying by depth and width to show how the gains in quality diminish with increased capacity. The total number of network parameters varies linearly with network depth (left) and quadratically with the number of units in each layer (right). All networks trained for 300k iterations on the NeRF "room" scene.
  • Figure 3: Framework -- The DeRF architecture (left) consists of a set of independent NeRF (right) networks which are each responsible for the region of space within a Voronoi cell defined by the decomposition parameters $\phi$. The final color value for a ray is computed by applying the volume rendering equation to each segment of radiance $\mathbf{c}$ and density $\sigma$, and alpha compositing together the resulting colors.
  • Figure 4: Decomposed radiance fields -- We visualize each of the rendering heads individually. Note that as each head is rendered, only the weights of one neural network head need to be loaded, resulting in optimal cache coherency while accessing GPU memory.
  • Figure 5: Compositing with the Painter's Algorithm -- Visualization of the intermediate steps of the compositing process.
  • ...and 13 more figures
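
The per-segment rendering and alpha compositing described in the captions of Figures 3 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: it uses the standard NeRF quadrature (per-sample opacity 1 - exp(-sigma * delta)), and the function names and sample values are hypothetical. It shows that rendering contiguous ray segments separately and compositing them back to front reproduces the result of rendering the whole ray at once.

```python
import math

def render_segment(sigmas, colors, deltas):
    """Volume-render one ray segment.

    Returns (premultiplied RGB color, alpha) using the standard quadrature:
    each sample contributes transmittance * (1 - exp(-sigma * delta)) * color.
    """
    transmittance = 1.0
    color = [0.0, 0.0, 0.0]
    for sigma, c, delta in zip(sigmas, colors, deltas):
        a = 1.0 - math.exp(-sigma * delta)   # opacity of this sample
        w = transmittance * a                # its weight along the segment
        color = [acc + w * ch for acc, ch in zip(color, c)]
        transmittance *= 1.0 - a
    return color, 1.0 - transmittance

def composite_back_to_front(segments):
    """Painter's-style 'over' compositing; `segments` is ordered back to front."""
    out = [0.0, 0.0, 0.0]
    for color, alpha in segments:
        # Paint the nearer segment over what has been accumulated so far.
        out = [c + (1.0 - alpha) * o for c, o in zip(color, out)]
    return out

# Hypothetical samples along one ray, listed front to back.
sigmas = [0.5, 2.0, 0.1, 4.0, 1.5]
deltas = [0.2, 0.1, 0.3, 0.1, 0.2]
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
          (1.0, 1.0, 0.0), (0.5, 0.5, 0.5)]

# Reference: render the whole ray in one pass.
full_color, _ = render_segment(sigmas, colors, deltas)

# Decomposed: samples 0-1 fall in a front cell, samples 2-4 in a back cell.
front = render_segment(sigmas[:2], colors[:2], deltas[:2])
back = render_segment(sigmas[2:], colors[2:], deltas[2:])
split_color = composite_back_to_front([back, front])
```

Here `split_color` matches `full_color` up to floating-point error, because the transmittance of the front segment is exactly the factor by which the back segment's contribution is attenuated. In this sketch each segment needs only its own samples, which is consistent with the Figure 4 remark that only one head's weights need to be loaded at a time.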