Spatial Annealing for Efficient Few-shot Neural Rendering

Yuru Xiao, Deming Zhai, Wenbo Zhao, Kui Jiang, Junjun Jiang, Xianming Liu

TL;DR

The paper tackles few-shot neural rendering with NeRF by introducing SANeRF, a spatial-annealing regularization designed for hybrid representations. It derives the frequency bandwidth of pre-filtered fields and couples that result with a spatial annealing strategy: SANeRF modulates the sampling kernel so that the spatial search space shrinks exponentially, which controls the bandwidth $\omega_m = f_s 2^x$ and the associated sampling radius $\tau = \dfrac{2\sqrt{\pi}}{f_s 2^x}$, while also adjusting a level-based sphere radius. The approach is a drop-in, one-line change to base architectures (e.g., TriMipRF/MipNeRF) and achieves state-of-the-art results in few-shot settings, outperforming FreeNeRF by $0.3$ dB in PSNR on Blender and delivering up to $700\times$ faster reconstruction. Ablation studies show complementary gains from spatial annealing (SA) and SHM: SA improves geometry, while SHM reduces color artifacts.
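The two formulas in the TL;DR can be checked numerically. The sketch below is only an illustration: the excerpt does not define $f_s$ or $x$, so it assumes $f_s$ is a base sampling frequency and $x$ a level-like exponent, and the function names are hypothetical. Under those assumptions, raising $x$ by one doubles the bandwidth and halves the sampling radius, while the product $\tau\,\omega_m = 2\sqrt{\pi}$ stays constant.

```python
import math


def bandwidth(f_s: float, x: float) -> float:
    """Band limit omega_m = f_s * 2**x, as written in the TL;DR.

    f_s and x are assumed meanings (base frequency, level exponent).
    """
    return f_s * 2.0 ** x


def sampling_radius(f_s: float, x: float) -> float:
    """Sampling radius tau = 2*sqrt(pi) / (f_s * 2**x), as written in the TL;DR."""
    return 2.0 * math.sqrt(math.pi) / (f_s * 2.0 ** x)


if __name__ == "__main__":
    f_s = 1.0  # hypothetical base frequency, chosen only for illustration
    for x in range(4):
        w = bandwidth(f_s, x)
        tau = sampling_radius(f_s, x)
        # The product tau * omega_m is the same at every level.
        print(f"x={x}  omega_m={w:.3f}  tau={tau:.3f}  tau*omega_m={tau * w:.3f}")
```

Read this way, a large sampling radius corresponds to a narrow band limit, which matches the statement that the annealing starts from a kernel with a narrow spectrum.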

Abstract

Neural Radiance Fields (NeRF) with hybrid representations have shown impressive capabilities for novel view synthesis, delivering high efficiency. Nonetheless, their performance significantly drops with sparse input views. Various regularization strategies have been devised to address these challenges. However, these strategies either require additional rendering costs or involve complex pipeline designs, leading to a loss of training efficiency. Although FreeNeRF has introduced an efficient frequency annealing strategy, its operation on frequency positional encoding is incompatible with the efficient hybrid representations. In this paper, we introduce an accurate and efficient few-shot neural rendering method named Spatial Annealing regularized NeRF (SANeRF), which adopts the pre-filtering design of a hybrid representation. We initially establish the analytical formulation of the frequency band limit for a hybrid architecture by deducing its filtering process. Based on this analysis, we propose a universal form of frequency annealing in the spatial domain, which can be implemented by modulating the sampling kernel to exponentially shrink from an initial one with a narrow grid tangent kernel spectrum. This methodology is crucial for stabilizing the early stages of the training phase and significantly contributes to enhancing the subsequent process of detail refinement. Our extensive experiments reveal that, by adding merely one line of code, SANeRF delivers superior rendering quality and much faster reconstruction speed compared to current few-shot neural rendering methods. Notably, SANeRF outperforms FreeNeRF on the Blender dataset, achieving 700$\times$ faster reconstruction speed.
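The exponentially shrinking sampling kernel described in the abstract could look roughly like the schedule below. This is a minimal sketch, not the authors' implementation: the function name, the initial scale `init_scale`, the halving period `decay_steps`, and the clamp at the base radius are all assumptions made for illustration, since the excerpt gives no concrete values or formula for the schedule.

```python
def annealed_radius(base_radius: float, step: int,
                    init_scale: float = 4.0, decay_steps: int = 512) -> float:
    """Return the sampling-sphere radius used at a given training step.

    Hypothetical schedule: the radius starts at init_scale * base_radius,
    halves every `decay_steps` steps, and never drops below base_radius.
    """
    scale = max(1.0, init_scale * 2.0 ** (-step / decay_steps))
    return base_radius * scale


if __name__ == "__main__":
    # Early steps use a large sphere (coarse, low-frequency geometry);
    # later steps fall back to the base sphere (fine detail).
    for step in (0, 256, 512, 1024, 4096):
        print(step, annealed_radius(base_radius=0.01, step=step))
```

In a training loop, the base architecture would query its features with `annealed_radius(r, step)` in place of `r`, which is one way to read the abstract's claim of a one-line change.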

Paper Structure (13 sections, 18 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: An overview of the complete framework. We introduce an efficient few-shot neural rendering method utilizing TriMipRF. Initially, we set the sample sphere's radius larger than that of the base sphere to optimize low-frequency geometry, as depicted in the bottom left corner. During training, we progressively reduce the sphere's radius through exponential decay, thereby refining local details within the reconstructed global structure.
  • Figure 2: Comparison results during the training procedure. The training loss curve reveals that TriMipRF converges prematurely, resulting in the underfitting of the geometry depicted on the left. Conversely, our spatial annealing strategy effectively addresses this challenge.
  • Figure 3: Visualization of GTK and its spectrum. We vary the size $r$ of the sampling sphere and measure the 1-D grid tangent kernel along with its frequency spectrum. The GTK spectrum indicates that a larger $r$, with a larger sampling space, corresponds to a narrower spectrum.
  • Figure 4: Qualitative rendered depth results for varying sizes of the sampling sphere.
  • Figure 5: Qualitative results on Blender. We present qualitative comparisons of our method with the base architecture TriMipRF and the FreeNeRF few-shot baseline, utilizing 8 input views consistent with the FreeNeRF configuration.
  • ...and 3 more figures