NeRF Director: Revisiting View Selection in Neural Volume Rendering

Wenhui Xiao, Rodrigo Santa Cruz, David Ahmedt-Aristizabal, Olivier Salvado, Clinton Fookes, Leo Lebrat

TL;DR

This work tackles the underexplored problem of training-view selection in neural volume rendering. It introduces NeRF Director, a unified framework featuring Farthest View Sampling (FVS) and Information Gain-based Sampling (IGS) to optimize the selection of training views, often achieving comparable or better rendering quality with fewer views and faster convergence. A robust evaluation protocol, including a novel uniform test set and coverage-based metrics, reveals that view diversity and distribution critically influence NeRF performance and ranking stability. Extensive experiments on NeRF Synthetic and TanksAndTemples demonstrate substantial gains over random, error-based, and uncertainty-guided baselines across InstantNGP and Plenoxels backbones, with practical implications for data efficiency and deployment. Limitations are acknowledged (object-centric assumptions and resolution uniformity), with future work pointing to unstructured setups and broader applicability of the framework.
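As a rough illustration of the coverage idea behind Farthest View Sampling, the sketch below greedily picks the view whose camera centre is farthest from the views already chosen. It is a minimal sketch under stated assumptions, not the authors' implementation: it uses only Euclidean distance between camera centres, and the function name `farthest_view_sampling`, the `seed_index` parameter, and the toy camera positions are hypothetical.

```python
import numpy as np


def farthest_view_sampling(positions, k, seed_index=0):
    """Greedy farthest-point selection over camera centres (illustrative only).

    positions: (N, 3) array of camera centres (assumed input format).
    k: number of views to select.
    seed_index: index of the first selected view (hypothetical parameter).
    """
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of views")

    selected = [seed_index]
    # Distance from every view to its nearest already-selected view.
    min_dist = np.linalg.norm(positions - positions[seed_index], axis=1)

    while len(selected) < k:
        # Pick the candidate that is farthest from the current selection.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(positions - positions[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected


# Toy example: three nearly coincident cameras plus three well-spread ones.
cameras = [
    [1.0, 0.0, 0.0],
    [0.99, 0.1, 0.0],
    [0.98, 0.2, 0.0],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
    [0.0, -1.0, 0.0],
]
chosen = farthest_view_sampling(cameras, k=4)
```

In this toy setup the selection skips the two cameras clustered next to the seed view and spreads the budget of four views around the object, which is the behaviour a coverage-driven, training-free selector aims for.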

Abstract

Neural Rendering representations have significantly contributed to the field of 3D computer vision. Given their potential, considerable efforts have been invested to improve their performance. Nonetheless, the essential question of selecting training views is yet to be thoroughly investigated. This key aspect plays a vital role in achieving high-quality results and aligns with the well-known tenet of deep learning: "garbage in, garbage out". In this paper, we first illustrate the importance of view selection by demonstrating how a simple rotation of the test views within the most pervasive NeRF dataset can lead to consequential shifts in the performance rankings of state-of-the-art techniques. To address this challenge, we introduce a unified framework for view selection methods and devise a thorough benchmark to assess its impact. Significant improvements can be achieved without leveraging error or uncertainty estimation but focusing on uniform view coverage of the reconstructed object, resulting in a training-free approach. Using this technique, we show that high-quality renderings can be achieved faster by using fewer views. We conduct extensive experiments on both synthetic datasets and realistic data to demonstrate the effectiveness of our proposed method compared with random, conventional error-based, and uncertainty-guided view selection.
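The test-view rotation mentioned in the abstract can be sketched as follows. This is a hedged example that assumes poses are stored as 4x4 camera-to-world matrices and that the rotation is applied about the world z-axis; the function name `rotate_poses_about_z` and the sample pose are hypothetical, and the paper's exact protocol may differ.

```python
import numpy as np


def rotate_poses_about_z(poses_c2w, angle_deg):
    """Rotate camera-to-world poses about the world z-axis (illustrative only).

    poses_c2w: (N, 4, 4) array of camera-to-world matrices (assumed format).
    angle_deg: rotation angle in degrees.
    """
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array(
        [
            [c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0],
        ]
    )
    # Left-multiplying moves each camera around the z-axis in world space.
    return rot @ np.asarray(poses_c2w, dtype=float)


# One camera at (1, 0, 2) with identity orientation.
pose = np.eye(4)
pose[:3, 3] = [1.0, 0.0, 2.0]
rotated = rotate_poses_about_z(pose[None], 90.0)
```

After a 90 degree rotation the camera centre moves from (1, 0, 2) to approximately (0, 1, 2), so the same scene is evaluated from a shifted set of viewpoints.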

Paper Structure

This paper contains 44 sections, 11 equations, 14 figures, 2 tables, 3 algorithms.

Figures (14)

  • Figure 1: Ranking the rendering performance of four distinct models under various z-axis rotations of the test camera poses. Left: original test set. Right: proposed test set.
  • Figure 2: Visual comparison between the original (left) and proposed (right) test sets for the Synthetic dataset. The top row visualizes the distribution of the default (unrotated) test cameras in 3D space. The bottom row displays the absolute difference in the coverage density measure between the default test set and a $90^\circ$ rotation. A lighter color indicates higher discrepancies in terms of the standard deviation $\sigma$.
  • Figure 3: Quantitative comparison of rendering quality as the number of training views increases, for different view selection methods. Top: results on the Synthetic dataset in terms of PSNR (a) and SSIM (b). Bottom: results on the TanksAndTemples dataset in terms of PSNR (c) and SSIM (d). Low-opacity lines show the result of each repetition, while high-opacity lines show the average across five repetitions.
  • Figure 4: Ablation studies of the information type (a) and the sampling strategy (b), as well as of different distance metrics (c), on the TanksAndTemples dataset.
  • Figure 5: Comparison between the distance metrics $\mathbf{d}_{euc}$ and $\mathbf{d}_{euc} + \mathbf{d}_{photo}$ in terms of PSNR on the Playground scene.
  • ...and 9 more figures