Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency

Yingjie Xu, Bangzhen Liu, Hao Tang, Bailin Deng, Shengfeng He

TL;DR

This work tackles the challenge of reconstructing radiance fields from very sparse views by treating unreliable warped regions not as noise but as informative cues. The authors introduce ReVoRF, a voxel-based method that uses a bilateral geometric consistency loss to jointly leverage reliable color/density signals and unreliable regions guided by relative depth priors, complemented by reliability-aware voxel smoothing and learning adjustment. Empirical results show that ReVoRF achieves faster training (e.g., a few minutes for a $360^\circ$ scene) and improved PSNR/LPIPS/SSIM metrics compared with prior few-shot NeRF methods, demonstrating enhanced cross-view consistency and geometry with limited data. The approach promises practical benefits for real-world 3D reconstruction under sparse observations, while acknowledging smoothing-induced detail loss and suggesting future hybrid representations to preserve fine details.

Abstract

We propose a voxel-based optimization framework, ReVoRF, for few-shot radiance fields that strategically addresses the unreliability in pseudo novel view synthesis. Our method pivots on the insight that relative depth relationships within neighboring regions are more reliable than the absolute color values in disoccluded areas. Consequently, we devise a bilateral geometric consistency loss that carefully navigates the trade-off between color fidelity and geometric accuracy in the context of depth consistency for uncertain regions. Moreover, we present a reliability-guided learning strategy to discern and utilize the variable quality across synthesized views, complemented by a reliability-aware voxel smoothing algorithm that smooths the transition between reliable and unreliable data patches. Our approach allows for a more nuanced use of all available data, promoting enhanced learning from regions previously considered unsuitable for high-quality reconstruction. Extensive experiments across diverse datasets reveal that our approach attains significant gains in efficiency and accuracy, delivering rendering speeds of 3 FPS, a training time of 7 minutes for a $360^\circ$ scene, and a 5\% improvement in PSNR over existing few-shot methods. Code is available at https://github.com/HKCLynn/ReVoRF.
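
To make the idea of supervising relative rather than absolute depth concrete, the following is a minimal sketch of a generic pairwise hinge ranking loss, written in NumPy. It is not taken from the ReVoRF code base: the function name `relative_depth_ranking_loss`, the `pairs` argument, and the `margin` value are all assumptions chosen for illustration, and the paper's actual loss may differ in form and in how pixel pairs are sampled.

```python
import numpy as np


def relative_depth_ranking_loss(rendered_depth, prior_depth, pairs, margin=1e-4):
    """Hinge loss on pixel pairs whose rendered depth order contradicts a prior.

    rendered_depth, prior_depth: 1-D arrays of per-pixel depth (scales may differ).
    pairs: (K, 2) integer array of pixel index pairs (i, j).
    All names and the margin value are illustrative assumptions.
    """
    rendered_depth = np.asarray(rendered_depth, dtype=float)
    prior_depth = np.asarray(prior_depth, dtype=float)
    pairs = np.asarray(pairs, dtype=int)
    i, j = pairs[:, 0], pairs[:, 1]

    # +1 where the prior puts pixel i farther than j, -1 where nearer, 0 for ties.
    order = np.sign(prior_depth[i] - prior_depth[j])
    diff = rendered_depth[i] - rendered_depth[j]

    # Zero once the rendered difference agrees with the prior by at least `margin`.
    per_pair = np.maximum(0.0, margin - order * diff)

    # Ties in the prior carry no ordering information, so they are excluded.
    valid = order != 0
    if not valid.any():
        return 0.0
    return float(per_pair[valid].mean())


# Only the ordering matters, so the prior can live on a different scale.
prior = np.array([1.0, 2.0, 3.0])
pairs = np.array([[0, 1], [1, 2], [0, 2]])
loss_consistent = relative_depth_ranking_loss(np.array([0.1, 0.2, 0.3]), prior, pairs)
loss_flipped = relative_depth_ranking_loss(np.array([0.3, 0.2, 0.1]), prior, pairs)
```

In this sketch `loss_consistent` is zero because every pair is ordered as the prior says, while `loss_flipped` is positive. The loss never compares absolute depth values, which is why a prior that is only reliable up to ordering can still provide a training signal.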

Paper Structure (26 sections, 15 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Overview of our proposed ReVoRF. Specifically, we first warp the sparse images onto several novel views and determine both the dependable and unreliable regions. Based on the dependability of each image region, we introduce a bilateral geometric consistency loss for multi-view consistent learning, which is composed of a color and density regularization term for reliable regions and a relative depth consistency term for unreliable regions. These two terms are responsible for explicitly learning the reliable geometric contents and implicitly exploring the geometric consistency via the guidance of relative depth, respectively. For voxel feature regularization, we integrate the unreliability through a reliability-guided learning strategy and a reliability-aware voxel smoothing procedure. By prioritizing the learning of more reliable regions and mitigating the inconsistencies in less reliable ones, ReVoRF ensures a more balanced and coherent reconstruction.
  • Figure 2: 4-view reconstructions on the Realistic Synthetic 360° dataset [MildenhallSTBRN20]. ReVoRF enables more consistent reconstruction with detailed appearance.
  • Figure 3: Comparisons on the LLFF dataset [MildenhallSCKRN19] in the 3-view setting. The red and blue boxes denote compared regions. Our approach achieves better results in reconstructing fine details with enhanced clarity. Please zoom in for details.
  • Figure 4: Visualizations of the ablation on the Chair scene from the Realistic Synthetic 360° dataset [MildenhallSTBRN20] in the 4-view setting. With the proposed losses, our method progressively improves cross-view consistency and reduces noise compared with the baseline.
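
The Figure 1 caption mentions prioritizing more reliable regions during learning. One simple way to picture that idea is to weight each pseudo view's loss by the fraction of its pixels marked reliable. The sketch below is an assumption made for illustration only: the helper names `view_reliability_weights` and `weighted_view_loss`, the boolean mask layout, and the normalization are hypothetical and are not described in the source.

```python
import numpy as np


def view_reliability_weights(reliable_masks):
    """Turn per-view reliability masks into normalized per-view weights.

    reliable_masks: (V, H, W) boolean array, True where a warped pixel is trusted.
    Hypothetical helper; the weighting rule is an illustrative assumption.
    """
    masks = np.asarray(reliable_masks, dtype=bool)
    num_views = masks.shape[0]
    # Fraction of trusted pixels in each view.
    ratios = masks.reshape(num_views, -1).mean(axis=1)
    total = ratios.sum()
    if total == 0:
        # No view has any trusted pixel: fall back to uniform weights.
        return np.full(num_views, 1.0 / num_views)
    return ratios / total


def weighted_view_loss(per_view_losses, reliable_masks):
    """Combine per-view losses so that more reliable views contribute more."""
    weights = view_reliability_weights(reliable_masks)
    return float(np.dot(weights, np.asarray(per_view_losses, dtype=float)))


# Two 2x2 pseudo views: the first fully trusted, the second with one trusted pixel.
masks = np.array([
    [[True, True], [True, True]],
    [[True, False], [False, False]],
])
weights = view_reliability_weights(masks)
combined = weighted_view_loss([1.0, 2.0], masks)
```

Here the trusted-pixel fractions are 1.0 and 0.25, so the weights come out as 0.8 and 0.2 and the combined loss is 1.2. A per-pixel weighting would be an equally plausible reading of the caption; the per-view form is used here only because it is the shortest to state.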