SSNeRF: Sparse View Semi-supervised Neural Radiance Fields with Augmentation

Xiao Cao, Beibei Lin, Bo Wang, Zhiyong Huang, Robby T. Tan

TL;DR

SSNeRF tackles sparse-view NeRF with a semi-supervised teacher-student framework augmented with sparse-view-specific degradations. A confidence map combining epistemic uncertainty and HSV-based cues selects high-confidence pseudo-labels from the teacher to supervise a perturbed student, while progressively harder augmentations expose and mitigate density noise and blur. The student's denoising ability is transferred back to the teacher through EMA updates, yielding robust novel-view synthesis with fewer artifacts on real and synthetic datasets. The approach improves over state-of-the-art sparse-view methods and offers practical resilience against sparse-view degradation in complex scenes.

Abstract

Sparse-view NeRF is challenging because limited input images lead to an under-constrained optimization problem for volume rendering. Existing methods address this issue by relying on supplementary information, such as depth maps. However, generating this supplementary information accurately remains problematic and often leads NeRF to produce images with undesired artifacts. To address these artifacts and enhance robustness, we propose SSNeRF, a sparse-view semi-supervised NeRF method based on a teacher-student framework. Our key idea is to challenge the NeRF module with progressively severe sparse-view degradation while providing high-confidence pseudo-labels. This helps the NeRF model become aware of the noise and incomplete information associated with sparse views, thus improving its robustness. The novelty of SSNeRF lies in its sparse-view-specific augmentations and semi-supervised learning mechanism. The teacher NeRF generates novel views along with confidence scores, while the student NeRF, perturbed by the augmented input, learns from the high-confidence pseudo-labels. Our sparse-view degradation augmentation progressively injects noise into volume rendering weights, perturbs feature maps in vulnerable layers, and simulates sparse-view blurriness. These augmentation strategies force the student NeRF to recognize degradation and produce clearer rendered views. By transferring the student's parameters to the teacher, the teacher gains increased robustness in subsequent training iterations. Extensive experiments demonstrate the effectiveness of SSNeRF in generating novel views with less sparse-view degradation. We will release code upon acceptance.
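To make the first augmentation concrete, the following is a minimal sketch of injecting progressively stronger noise into volume rendering weights. The weight formula is the standard NeRF quadrature; everything else is an assumption for illustration and is not taken from the paper: the Gaussian noise model, the linear schedule, the clamping at zero, and the names `noise_scale`, `perturb_weights`, and `max_scale`.

```python
import math
import random


def render_weights(sigmas, deltas):
    """Standard NeRF weights: w_i = T_i * (1 - exp(-sigma_i * delta_i))."""
    weights = []
    transmittance = 1.0  # nothing occludes the first sample on the ray
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def noise_scale(step, total_steps, max_scale=0.1):
    """Hypothetical linear schedule: noise grows from 0 up to max_scale."""
    return max_scale * min(1.0, step / total_steps)


def perturb_weights(weights, step, total_steps, rng, max_scale=0.1):
    """Add zero-mean Gaussian noise (an assumed noise model), clamped at 0."""
    scale = noise_scale(step, total_steps, max_scale)
    return [max(0.0, w + rng.gauss(0.0, scale)) for w in weights]


rng = random.Random(0)
sigmas = [0.0, 0.5, 4.0, 0.2]      # densities along one ray (toy values)
deltas = [0.25, 0.25, 0.25, 0.25]  # distances between adjacent samples
clean = render_weights(sigmas, deltas)
early = perturb_weights(clean, step=0, total_steps=1000, rng=rng)
late = perturb_weights(clean, step=1000, total_steps=1000, rng=rng)
```

At step 0 the schedule returns a scale of zero, so `early` equals `clean`; by the final step the student sees weights perturbed at the full assumed scale. The actual noise distribution and schedule used by SSNeRF may differ.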

Paper Structure (22 sections, 4 equations, 3 figures, 3 tables)

This paper contains 22 sections, 4 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Qualitative results on the fern and drums scenes with 3 training views. Our method effectively removes hallucinations and floating points from the rendered images. For a zoomed-in version, see Figure 3. Results for more scenes can be found in the supplementary material.
  • Figure 2: SSNeRF framework. Our framework consists of two stages: (A) a pretraining stage and (B) a semi-supervised learning stage. The parameters are assigned to both the teacher branch and the student branch as initialization. The student branch is challenged with the designed augmentations and supervised by teacher-generated high-confidence pseudo-labels together with the sparse-view training data. The learned knowledge is passed to the teacher via EMA; at inference time, all augmentations are removed and only the teacher NeRF is kept.
  • Figure 3: Zoomed-in qualitative results on the fern, leaves, and trex scenes with 3 training views. Regions containing floating points are marked with light red boxes, and weaknesses are highlighted with dark red circles. More results can be found in the supplementary material.
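The semi-supervised stage described in the Figure 2 caption can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: the confidence map is taken as given (the paper derives it from epistemic uncertainty and HSV-based cues), and the hard threshold, the mean-squared-error form, the decay value, and the names `masked_pseudo_label_loss` and `ema_update` are all hypothetical.

```python
def masked_pseudo_label_loss(student_rgb, teacher_rgb, confidence, threshold=0.9):
    """MSE over pixels whose teacher confidence is at least `threshold`."""
    total, count = 0.0, 0
    for s_px, t_px, conf in zip(student_rgb, teacher_rgb, confidence):
        if conf >= threshold:  # low-confidence pseudo-labels are ignored
            total += sum((s - t) ** 2 for s, t in zip(s_px, t_px)) / len(s_px)
            count += 1
    return total / count if count else 0.0


def ema_update(teacher_params, student_params, decay=0.99):
    """teacher <- decay * teacher + (1 - decay) * student, per parameter."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_params, student_params)]


# Two toy pixels: the student matches the teacher on the confident pixel only.
student_rgb = [(0.5, 0.5, 0.5), (1.0, 0.0, 0.0)]
teacher_rgb = [(0.5, 0.5, 0.5), (0.0, 0.0, 0.0)]
confidence = [0.95, 0.2]

strict_loss = masked_pseudo_label_loss(student_rgb, teacher_rgb, confidence)
loose_loss = masked_pseudo_label_loss(student_rgb, teacher_rgb, confidence,
                                      threshold=0.1)
new_teacher = ema_update([0.0, 1.0], [1.0, 1.0], decay=0.5)
```

With the default threshold, the mismatched second pixel is masked out, so `strict_loss` is zero; lowering the threshold lets the unreliable pseudo-label contribute to the loss. The EMA step moves each teacher parameter a fraction of the way toward the student, which is how the student's learned robustness reaches the teacher that is kept at inference time.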