HyperPlanes: Hypernetwork Approach to Rapid NeRF Adaptation

Paweł Batorski, Dawid Malarz, Marcin Przewięźlikowski, Marcin Mazur, Sławomir Tadeja, Przemysław Spurek

TL;DR

HyperPlanes introduces a hypernetwork-based, single-step adaptation strategy for neural radiance fields (NeRFs) that requires no gradient-based optimization at inference time. A hypernetwork encodes a small set of support views and generates a task-specific weight update for a lightweight target decoder (PointMultiPlaneNeRF), producing a NeRF representation of an unseen object in a single step. On ShapeNet, the method reports higher PSNR and much shorter adaptation time than gradient-based NeRF methods (about 10 seconds for adapting and rendering versus 63.7 minutes for training a vanilla NeRF, a roughly $380\times$ speedup), and the design choices are supported by ablations. The approach targets rapid, high-quality 3D reconstruction from few images, with potential for real-time content creation in VR/AR and related applications.

Abstract

Neural radiance fields (NeRFs) are a widely accepted standard for synthesizing new 3D object views from a small number of base images. However, NeRFs have limited generalization properties, which means that we need to use significant computational resources to train individual architectures for each item we want to represent. To address this issue, we propose a few-shot learning approach based on the hypernetwork paradigm that does not require gradient optimization during inference. The hypernetwork gathers information from the training data and generates an update for universal weights. As a result, we have developed an efficient method for generating a high-quality 3D object representation from a small number of images in a single step. This has been confirmed by direct comparison with the state-of-the-art solutions and a comprehensive ablation study.
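
The idea of generating "an update for universal weights" in a single step can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the layer sizes, the mean pooling of support views, the linear hypernetwork heads, and the names `adapt` and `decode` are all hypothetical, and the sketch omits the target weights and viewing directions that the paper's hypernetwork also takes as input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_VIEWS, VIEW_DIM = 4, 32            # support views, feature size per view
IN_DIM, HIDDEN, OUT_DIM = 3, 16, 4   # toy decoder: 3D point -> (r, g, b, sigma)

# "Universal" target weights shared across objects (random here; in practice
# they would be learned during training).
universal = {
    "W1": rng.normal(0.0, 0.1, (IN_DIM, HIDDEN)),
    "W2": rng.normal(0.0, 0.1, (HIDDEN, OUT_DIM)),
}

# Toy hypernetwork: one linear head per target layer, mapping the support-set
# embedding to a flat vector with as many entries as that layer has weights.
hyper = {name: rng.normal(0.0, 0.01, (VIEW_DIM, w.size))
         for name, w in universal.items()}


def adapt(support_views):
    """Return task-specific weights from one forward pass (no gradient steps)."""
    # Assumption: the support set is aggregated by mean pooling.
    embedding = support_views.mean(axis=0)
    adapted = {}
    for name, w in universal.items():
        delta = (embedding @ hyper[name]).reshape(w.shape)  # predicted update
        adapted[name] = w + delta                           # universal + update
    return adapted


def decode(points, weights):
    """Toy two-layer target network evaluated with the given weights."""
    hidden = np.maximum(points @ weights["W1"], 0.0)  # ReLU
    return hidden @ weights["W2"]


support = rng.normal(size=(N_VIEWS, VIEW_DIM))
weights = adapt(support)
out = decode(rng.normal(size=(5, IN_DIM)), weights)
```

The point of the sketch is that adaptation to a new object costs one hypernetwork evaluation plus an addition, in contrast to per-object gradient descent on the decoder weights.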

Paper Structure

This paper contains 41 sections, 9 equations, 14 figures, 5 tables.

Figures (14)

  • Figure 1: In a single update, our HyperPlanes model can produce 3D objects that achieve a superior PSNR ($\uparrow$) compared with those generated by a vanilla NeRF trained for approximately 36000 epochs. This yields a $380\times$ speedup in object reconstruction (10 seconds for adapting and rendering with HyperPlanes vs. 63.7 minutes for training NeRF) while achieving better or comparable quality.
  • Figure 2: Visualization of data utilized by HyperPlanes (left), and architecture (right).
  • Figure 3: Sample images from the MultiPlaneNeRF and HyperPlanes-100 models trained on the ShapeNet 200$\times$200 dataset, showing reconstructed Cars, Chairs, and Planes alongside their ground truths. All images were taken from the same viewing direction. For more examples, see Figure \ref{fig:reconstruction_big} in the appendix.
  • Figure 4: Comparison of PSNR ($\uparrow$) during HyperPlanes training for different target network architectures (NeRF, MultiPlaneNeRF, and PointMultiPlaneNeRF) and for variants that omit the weights and viewing directions from the hypernetwork input. Results (averaged over 5 runs) were obtained on the car class of the ShapeNet 128$\times$128 dataset after 40 epochs. For extended results, see Figure \ref{fig:multiplane_vs_nerf_full} in the appendix.
  • Figure 5: Boxplots of the final PSNR ($\uparrow$) values obtained by the HyperPlanes model with different numbers of HyperPlanes. Results were obtained on the planes class of the ShapeNet 200$\times$200 dataset after 40 epochs. For extended results, see Figure \ref{fig:hyperplanes_full} in the appendix.
  • ...and 9 more figures