Learning Robust Generalizable Radiance Field with Visibility and Feature Augmented Point Representation

Jiaxu Wang, Ziyi Zhang, Renjing Xu

TL;DR

This work addresses the limitations of existing generalizable NeRFs by introducing a point-based Generalizable neural Point Field (GPF) that explicitly models visibilities with geometric priors and augments them with neural features. It introduces a density-guided nonuniform log sampling strategy and a feature-augmented learnable kernel to robustly aggregate features, along with a three-stage hierarchical finetuning procedure that enables generalization without per-scene optimization. Across NeRF Synthetic, DTU, and BlendedMVS datasets, the method achieves superior geometry, view consistency, and rendering quality under both generalization and finetuning settings, surpassing image-based baselines and other point-based approaches. The approach also enables interactive manipulation of the neural point field, highlighting a practical and flexible direction for generalizable NeRFs and neural rendering at large.

Abstract

This paper introduces a novel paradigm for the generalizable neural radiance field (NeRF). Previous generic NeRF methods combine multiview stereo techniques with image-based neural rendering for generalization, yielding impressive results but suffering from three issues. First, occlusions often result in inconsistent feature matching. Second, they produce distortions and artifacts at geometric discontinuities and locally sharp shapes because sampled points are processed individually and features are aggregated coarsely. Third, their image-based representations degrade severely when source views are not close enough to the target view. To address these challenges, we propose the first paradigm that constructs the generalizable neural field based on point-based rather than image-based rendering, which we call the Generalizable neural Point Field (GPF). Our approach explicitly models visibilities using geometric priors and augments them with neural features. We propose a novel nonuniform log sampling strategy to improve both rendering speed and reconstruction quality. Moreover, we present a learnable kernel spatially augmented with features for feature aggregation, mitigating distortions in regions with drastically varying geometry. In addition, our representation can be easily manipulated. Experiments show that our model delivers better geometry, view consistency, and rendering quality than all counterparts and benchmarks on three datasets in both generalization and finetuning settings, providing preliminary evidence of the new paradigm's potential for generalizable NeRF.
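
The abstract names a nonuniform log sampling strategy without giving its formulation. As a rough illustration of the general idea only, the sketch below places ray samples at logarithmically spaced offsets around an estimated surface depth, so samples are dense near the surface and sparse far from it. The function name `log_spaced_depths`, its parameters, and the assumption that a single surface depth per ray is available (for example, from nearby points) are all hypothetical and are not taken from the paper.

```python
import math


def log_spaced_depths(surface_depth, near, far, n_per_side=4, min_offset=1e-2):
    """Return sorted ray depths in [near, far], denser close to surface_depth.

    Hypothetical illustration: offsets from the surface grow geometrically
    from min_offset up to the distance to the near/far bound on each side.
    """
    if not (near <= surface_depth <= far):
        raise ValueError("surface_depth must lie within [near, far]")
    if n_per_side < 1 or min_offset <= 0:
        raise ValueError("n_per_side must be >= 1 and min_offset must be > 0")

    depths = [surface_depth]
    # Handle the side in front of the surface (sign -1) and behind it (sign +1).
    for extent, sign in ((surface_depth - near, -1.0), (far - surface_depth, 1.0)):
        if extent <= min_offset:
            continue  # no room for log-spaced samples on this side
        log_lo, log_hi = math.log(min_offset), math.log(extent)
        for k in range(n_per_side):
            # t runs from 0 to 1; interpolate linearly in log space.
            t = k / (n_per_side - 1) if n_per_side > 1 else 1.0
            offset = math.exp(log_lo + t * (log_hi - log_lo))
            depth = surface_depth + sign * offset
            # Clamp to guard against floating-point overshoot at the bounds.
            depths.append(min(max(depth, near), far))
    return sorted(depths)


if __name__ == "__main__":
    for d in log_spaced_depths(2.0, near=1.0, far=4.0):
        print(f"{d:.4f}")
```

With a budget of a few samples per side, most samples fall within a small neighborhood of the assumed surface, which is the intuition behind spending fewer samples on empty space. How the paper actually guides the sampling by density is not specified in this summary.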

Paper Structure (29 sections, 14 equations, 21 figures, 8 tables)

Figures (21)

  • Figure 1: Our approach produces sharper and clearer results at discontinuous geometries in an unobserved scenario without per-scene training and synthesizes higher-quality images than the baselines.
  • Figure 2: Overview of our model's pipeline. (a) depicts the hierarchical feature extraction. (b) shows the visibility-oriented feature fetching. (c) denotes the density-guided robust log sampling. (d) illustrates feature aggregation by the feature-augmented learnable kernel.
  • Figure 3: Qualitative comparisons of novel view synthesis under the generalization setting.
  • Figure 4: Qualitative comparisons of novel view synthesis under the finetuning setting.
  • Figure 5: Qualitative comparisons between our method and NeuRay, an implicit occlusion-aware image-based rendering method, including the rendered views and the reconstructed geometries. The ground truth is shown in the green box on the right side of the figure.
  • ...and 16 more figures