Analyzing the Internals of Neural Radiance Fields

Lukas Radl, Andreas Kurz, Michael Steiner, Markus Steinberger

TL;DR

This work finds that trained NeRFs, Mip-NeRFs and proposal network samplers map samples with high density to local minima along a ray in activation feature space and shows how these large MLPs can be accelerated by transforming intermediate activations to a weight estimate, without any modifications to the training protocol or the network architecture.

Abstract

Modern Neural Radiance Fields (NeRFs) learn a mapping from position to volumetric density leveraging proposal network samplers. In contrast to the coarse-to-fine sampling approach with two NeRFs, this offers significant potential for acceleration using lower network capacity. Given that NeRFs utilize most of their network capacity to estimate radiance, they could store valuable density information in their parameters or their deep features. To investigate this proposition, we take one step back and analyze large, trained ReLU-MLPs used in coarse-to-fine sampling. Building on our novel activation visualization method, we find that trained NeRFs, Mip-NeRFs and proposal network samplers map samples with high density to local minima along a ray in activation feature space. We show how these large MLPs can be accelerated by transforming intermediate activations to a weight estimate, without any modifications to the training protocol or the network architecture. With our approach, we can reduce the computational requirements of trained NeRFs by up to 50% with only a slight hit in rendering quality. Extensive experimental evaluation on a variety of datasets and architectures demonstrates the effectiveness of our approach. Consequently, our methodology provides valuable insight into the inner workings of NeRFs.

Paper Structure

This paper contains 33 sections, 10 equations, 8 figures, 12 tables.

Figures (8)

  • Figure 1: We show how density estimates derived from intermediate activations can accelerate inference for pre-trained Neural Radiance Fields by effectively reducing the capacity of large MLPs. Here, we show a small toy example for a ray with seven uniformly placed samples between $t_n$ and $t_f$. We obtain an activation feature vector ${\mathbf{f}}_{\ell}$ for layer $\ell$ and apply a function to obtain a density estimate $\hat{{\mathbf{d}}}$, leveraging the observation that minima in activation feature space indicate samples with high density $\sigma$. Finally, we perform inverse transform sampling with our weight estimate $\hat{{\mathbf{w}}}$.
  • Figure 2: Ground-truth images and their corresponding normalized coarse and fine activations $v_{\ell}$ using the magma colormap reveal an interesting relationship between activations and outputs. With our visualization approach, we can infer some scene content using only $v_{\ell}$. For each scene, we visualize activations for different layers $\ell$.
  • Figure 3: Intermediate activations allow for simple performance improvements. We can reduce the inference time of NeRFs in synthetic scenes by performing the fine pass only when the condition $v_{\ell} < \mu({\mathbf{V}}_\ell)$ is met.
  • Figure 4: Visualization of our proposed approach for approximate density extraction for a real-world example: we visualize activation features ${\mathbf{f}}_{\ell}$ and densities $\sigma$ for $128$ uniform samples along an example ray, for a Mip-NeRF trained on the chair scene. Using Eqn. \ref{eq:std}, a plausible density estimate $\hat{\mathbf{d}}$ is extracted from an activation feature ${\mathbf{f}}_{\ell}$.
  • Figure 5: Qualitative results for our approach (top row) compared to baseline methods (bottom row) for synthetic and real-world scenes. As can be seen in the zoomed-in views, our best renderings are virtually indistinguishable from the baseline in most configurations.
  • ...and 3 more figures
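
The pipeline in the Figure 1 caption (activation features, then a density estimate, then a weight estimate, then inverse transform sampling) and the fine-pass condition from Figure 3 can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: `density_estimate` is a hypothetical stand-in (inverted min-max scaling) for the paper's actual transform, which this page does not spell out; the weights are assumed to follow the standard NeRF compositing formula; each of the seven samples is assumed to own one uniform bin; $\mu$ is assumed to be the arithmetic mean; and the activation values are made up.

```python
import bisect
import math


def density_estimate(features):
    """Hypothetical stand-in for the paper's transform: inverted min-max
    scaling, so the smallest activation maps to the largest density."""
    lo, hi = min(features), max(features)
    if hi == lo:
        return [1.0] * len(features)
    return [(hi - f) / (hi - lo) for f in features]


def weight_estimate(density, deltas):
    """Standard NeRF compositing weights: w_i = T_i * (1 - exp(-sigma_i * delta_i)).
    Assumed here; the page does not state how the weight estimate is formed."""
    weights, transmittance = [], 1.0
    for sigma, delta in zip(density, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def inverse_transform_sample(edges, weights, us):
    """Map uniform values in [0, 1] to positions along the ray by inverting
    the CDF of a piecewise-constant PDF defined over the bins."""
    n = len(weights)
    total = sum(weights)
    pdf = [w / total for w in weights] if total > 0.0 else [1.0 / n] * n
    cdf = [0.0]
    for p in pdf:
        cdf.append(min(1.0, cdf[-1] + p))
    cdf[-1] = 1.0  # guard against rounding error
    samples = []
    for u in us:
        # Locate the bin i with cdf[i] <= u < cdf[i + 1].
        i = min(max(bisect.bisect_right(cdf, u) - 1, 0), n - 1)
        width = cdf[i + 1] - cdf[i]
        frac = (u - cdf[i]) / width if width > 0.0 else 0.5
        samples.append(edges[i] + frac * (edges[i + 1] - edges[i]))
    return samples


def fine_pass_mask(values):
    """Figure 3 condition v < mu(V), assuming mu is the arithmetic mean."""
    mean = sum(values) / len(values)
    return [v < mean for v in values]


# Toy ray: seven uniform bins between t_n and t_f, one per sample.
t_n, t_f, n = 2.0, 6.0, 7
edges = [t_n + i * (t_f - t_n) / n for i in range(n + 1)]
deltas = [edges[i + 1] - edges[i] for i in range(n)]
features = [5.0, 5.0, 4.0, 1.0, 4.0, 5.0, 5.0]  # made-up activations, minimum at index 3
d_hat = density_estimate(features)
w_hat = weight_estimate(d_hat, deltas)
us = [(k + 0.5) / 16 for k in range(16)]  # deterministic, evenly spaced uniforms
new_t = inverse_transform_sample(edges, w_hat, us)
```

In this toy setup the activation minimum sits at the fourth sample, so the resampled positions in `new_t` concentrate in the fourth bin. Evenly spaced values for `us` keep the example deterministic; a renderer would typically draw them at random or with stratified jitter.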