KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs

Christian Reiser, Songyou Peng, Yiyi Liao, Andreas Geiger

TL;DR

KiloNeRF tackles NeRF's rendering bottleneck by decomposing a scene into a regular grid of thousands of tiny MLPs, each responsible for a small spatial cell. A three-stage training pipeline (teacher NeRF pretraining, distillation to initialize the tiny networks, and final fine-tuning) preserves visual fidelity while enabling dramatic speedups. By combining empty-space skipping, early ray termination, and specialized GPU-accelerated evaluation, the method renders roughly three orders of magnitude faster than the original NeRF, at comparable quality and with modest storage requirements. The work demonstrates a practical path toward real-time neural radiance field rendering on consumer hardware and suggests avenues for integration with other fast novel-view-synthesis (NVS) techniques.
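Empty-space skipping and early ray termination can be illustrated with a minimal sketch of front-to-back compositing along a single ray. This is not the authors' GPU implementation: the function `composite_ray`, the `query` callback standing in for a network evaluation, the `occupied` flags standing in for a precomputed occupancy grid, and the threshold `eps=0.01` are all hypothetical names and values chosen for illustration.

```python
import math

def composite_ray(query, deltas, occupied, eps=0.01):
    """Composite one ray front to back, skipping empty samples.

    query:    callable i -> (sigma, (r, g, b)); stands in for a network query
    deltas:   distance covered by each sample along the ray
    occupied: per-sample flags from an (assumed) occupancy grid
    eps:      transmittance threshold for early termination (assumed value)
    """
    transmittance = 1.0
    rgb = [0.0, 0.0, 0.0]
    num_queries = 0
    for i, (delta, is_occupied) in enumerate(zip(deltas, occupied)):
        if not is_occupied:
            continue  # empty-space skipping: no network query here
        sigma, color = query(i)
        num_queries += 1
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for c in range(3):
            rgb[c] += weight * color[c]
        transmittance *= 1.0 - alpha
        if transmittance < eps:
            break  # early ray termination: later samples barely contribute
    return tuple(rgb), num_queries

# Toy ray: one empty sample, then an opaque red surface, then hidden samples.
samples = [
    (0.0, (0.0, 0.0, 0.0)),
    (1000.0, (1.0, 0.0, 0.0)),
    (5.0, (0.0, 1.0, 0.0)),
    (5.0, (0.0, 0.0, 1.0)),
]
rgb, num_queries = composite_ray(
    lambda i: samples[i],
    deltas=[0.1] * 4,
    occupied=[False, True, True, True],
)
```

In this toy ray the first sample is skipped because its cell is marked empty, and the loop stops after the opaque second sample, so only one of the four samples triggers a query.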

Abstract

NeRF synthesizes novel views of a scene with unprecedented quality by fitting a neural radiance field to RGB images. However, NeRF requires querying a deep Multi-Layer Perceptron (MLP) millions of times, leading to slow rendering times, even on modern GPUs. In this paper, we demonstrate that real-time rendering is possible by utilizing thousands of tiny MLPs instead of one single large MLP. In our setting, each individual MLP only needs to represent parts of the scene, thus smaller and faster-to-evaluate MLPs can be used. By combining this divide-and-conquer strategy with further optimizations, rendering is accelerated by three orders of magnitude compared to the original NeRF model without incurring high storage costs. Further, using teacher-student distillation for training, we show that this speed-up can be achieved without sacrificing visual quality.
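The divide-and-conquer idea rests on routing each sample point to the network that owns its grid cell. The sketch below shows one way this lookup could work, assuming an axis-aligned bounding box and a uniform grid; `cell_index`, its parameters, and the 16 x 16 x 16 resolution in the usage lines are illustrative assumptions rather than details taken from the text above.

```python
def cell_index(point, aabb_min, aabb_max, resolution):
    """Return the flat index of the grid cell (and tiny MLP) owning `point`.

    point, aabb_min, aabb_max: 3-tuples of floats
    resolution: 3-tuple with the number of cells per axis
    Returns None when the point lies outside the bounding box.
    """
    idx = []
    for p, lo, hi, res in zip(point, aabb_min, aabb_max, resolution):
        if not lo <= p <= hi:
            return None
        i = int((p - lo) / (hi - lo) * res)
        idx.append(min(i, res - 1))  # clamp points on the upper boundary
    ix, iy, iz = idx
    _, ry, rz = resolution
    return (ix * ry + iy) * rz + iz  # row-major flattening

# Hypothetical scene bounds and grid: 16^3 = 4096 cells, one tiny MLP each.
bounds_min, bounds_max, grid = (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0), (16, 16, 16)
center = cell_index((0.0, 0.0, 0.0), bounds_min, bounds_max, grid)
corner = cell_index((1.0, 1.0, 1.0), bounds_min, bounds_max, grid)
outside = cell_index((2.0, 0.0, 0.0), bounds_min, bounds_max, grid)
```

Grouping samples by this index is what would let all points in a cell be evaluated by the same small network in one batch.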

Paper Structure

This paper contains 14 sections, 5 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: KiloNeRF. Instead of representing the entire scene by a single, high-capacity MLP, we represent the scene by thousands of small MLPs. This allows us to render the scene above 2548x faster without loss in visual quality.
  • Figure 2: Model Architecture. KiloNeRF's MLP architecture is a downscaled version of NeRF's architecture. A forward pass through KiloNeRF's network only requires 1/87th of the floating point operations (FLOPs) of the original architecture.
  • Figure 3: Distillation. (a) Training KiloNeRF from scratch can lead to artifacts in free space. (b) Distillation by imitating a pre-trained standard NeRF model mitigates this issue.
  • Figure 4: Qualitative Comparison. Novel views synthesized by NeRF, NSVF and KiloNeRF. Despite being significantly faster, KiloNeRF attains the visual quality of the baselines. The numbers in the top-right corner correspond to the average render time of the respective technique on that scene. The rendered image resolution (in pixels) is specified on the left.
  • Figure 5: Ablation Study. Closeups of KiloNeRF on the Lego bulldozer scene, varying different parameters of the model. The numbers in the bottom-right corner correspond to perceptual similarity (LPIPS) with respect to the ground truth; lower is better.
  • ...and 1 more figures