Depth-Guided Bundle Sampling for Efficient Generalizable Neural Radiance Field Reconstruction

Li Fang, Hao Zhu, Longlong Chen, Fei Hu, Long Ye, Zhan Ma

TL;DR

This work tackles the high computational cost of rendering high-resolution views in generalizable NeRFs by introducing depth-guided bundle sampling, which groups adjacent rays into cones and samples them jointly with depth-aware adaptation. It leverages plenoptic sampling principles to concentrate sampling near depth surfaces and reduces overall ray counts without sacrificing quality, applying the method to ENeRF and MVSGaussian. Experimental results on DTU, Real Forward-facing, and NeRF Synthetic datasets show state-of-the-art or competitive quality with substantial speedups, including up to $2\times$ faster rendering and notable PSNR/SSIM/LPIPS improvements. The approach offers a flexible speed-accuracy trade-off via bundle size and depth-guided sampling, with robust ablations confirming the value of sphere-based bundle sampling and joint bundle plus ray-specific representations, while acknowledging limitations in depth estimation-heavy or object-centric scenes.

Abstract

Recent advancements in generalizable novel view synthesis have achieved impressive quality through interpolation between nearby views. However, rendering high-resolution images remains computationally intensive due to the need for dense sampling of all rays. Recognizing that natural scenes are typically piecewise smooth and sampling all rays is often redundant, we propose a novel depth-guided bundle sampling strategy to accelerate rendering. By grouping adjacent rays into a bundle and sampling them collectively, a shared representation is generated for decoding all rays within the bundle. To further optimize efficiency, our adaptive sampling strategy dynamically allocates samples based on depth confidence, concentrating more samples in complex regions while reducing them in smoother areas. When applied to ENeRF, our method achieves up to a 1.27 dB PSNR improvement and a 47% increase in FPS on the DTU dataset. Extensive experiments on synthetic and real-world datasets demonstrate state-of-the-art rendering quality and up to 2x faster rendering compared to existing generalizable methods. Code is available at https://github.com/KLMAV-CUC/GDB-NeRF.
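The two ideas in the abstract, grouping adjacent rays into bundles and giving each bundle a sample budget that depends on depth confidence, can be illustrated with a small sketch. This is not the authors' implementation: the function names `group_into_bundles` and `allocate_samples`, the linear mapping from confidence to sample count, and the `min_samples`/`max_samples` bounds are all assumptions made for illustration. It also assumes that lower depth confidence calls for more samples.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]


def group_into_bundles(height: int, width: int, bundle: int = 2) -> List[List[Pixel]]:
    """Partition the pixel grid into non-overlapping bundle x bundle blocks.

    Each block stands for one bundle of adjacent rays that would be
    sampled jointly instead of ray by ray.
    """
    bundles = []
    for top in range(0, height, bundle):
        for left in range(0, width, bundle):
            block = [
                (r, c)
                for r in range(top, min(top + bundle, height))
                for c in range(left, min(left + bundle, width))
            ]
            bundles.append(block)
    return bundles


def allocate_samples(
    confidence: List[List[float]],
    bundles: List[List[Pixel]],
    min_samples: int = 1,
    max_samples: int = 4,
) -> List[int]:
    """Assign a per-bundle sample count from mean depth confidence in [0, 1].

    Hypothetical rule (an assumption, not taken from the paper): map
    (1 - confidence) linearly onto [min_samples, max_samples], so that
    uncertain regions receive more samples than confident ones.
    """
    counts = []
    for block in bundles:
        mean_conf = sum(confidence[r][c] for r, c in block) / len(block)
        n = min_samples + round((1.0 - mean_conf) * (max_samples - min_samples))
        counts.append(max(min_samples, min(max_samples, n)))
    return counts


# Toy 4x4 confidence map: one uncertain quadrant, the rest confident.
confidence = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
bundles = group_into_bundles(4, 4, bundle=2)
counts = allocate_samples(confidence, bundles)
per_ray_budget = 4 * 4 * 4       # 16 rays, each with max_samples
bundle_budget = sum(counts)      # one sample set shared per bundle
```

In this toy setup the bundle budget is far smaller than the per-ray budget because three of the four bundles fall in confident regions. The bundle size plays the role of the speed-accuracy knob mentioned in the TL;DR: larger bundles mean fewer joint samples but a coarser shared representation.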

Paper Structure

This paper contains 32 sections, 11 equations, 7 figures, 12 tables.

Figures (7)

  • Figure 1: Rendering quality (PSNR) vs. speed (FPS) on the DTU dataset [aanaes2016large] under the 3-view setting. ENeRF+Ours achieves state-of-the-art rendering quality and faster rendering speed, outperforming existing generalizable novel view synthesis methods [wang2021ibrnet, chen2021mvsnerf, lin2022efficient, chen2023matchnerf, xu2024murf, liu2024mvsgaussian]. Model parameter counts are also provided.
  • Figure 2: Network architecture of the proposed depth-guided bundle sampling strategy on ENeRF [lin2022efficient], denoted as ENeRF+Ours, which consists of four main components: (1) Multi-scale feature extraction; (2) Depth estimation to predict depth range; (3) Depth-guided bundle sampling, where rays are grouped into bundles and sampled adaptively based on predicted depth confidence; and (4) Radiance field prediction, which decodes each bundle's representation into individual ray colors.
  • Figure 3: Comparison of ray sampling strategies: (a) Traditional ray sampling, which processes each ray individually; (b) Proposed bundle sampling, which jointly samples neighboring rays to leverage scene coherence.
  • Figure 4: Qualitative comparison of ENeRF+Ours ($2 \times 2$) with state-of-the-art methods [lin2022efficient, chen2023matchnerf, liu2024mvsgaussian] under the 3-view setting. Each image triplet includes: the reconstructed image on the left, a zoomed-in view on the upper right, and the error map on the lower right.
  • Figure 5: Visualization of sample allocation in depth-guided adaptive sampling.
  • ...and 2 more figures