VRP-UDF: Towards Unbiased Learning of Unsigned Distance Functions from Multi-view Images with Volume Rendering Priors

Wenyuan Zhang, Chunsheng Wang, Kanle Shi, Yu-Shen Liu, Zhizhong Han

TL;DR

This paper tackles the challenge of reconstructing open surfaces via unsigned distance functions from multi-view images, where existing differentiable renderers are biased and struggle with scalability. It introduces volume rendering priors, a learnable neural renderer that maps local unsigned distances to rendering weights, enabling unbiased depth and RGB rendering for UDF inference and generalizing to unseen scenes. The authors contribute a multi-resolution prior network, auxiliary sampling priors, and a uniform upsampling strategy to reduce hierarchical sampling bias, and demonstrate that the priors improve reconstruction quality across diverse benchmarks while also refining Gaussian splatting and extending to SDF and occupancy representations. The approach yields state-of-the-art results on ShapeNet, DF3D, DTU, Replica, Insects, and real scans, highlighting the practical impact for robust open-surface 3D reconstruction and flexible integration with alternative neural representations.

Abstract

Unsigned distance functions (UDFs) have been a vital representation for open surfaces. With different differentiable renderers, current methods are able to train neural networks to infer a UDF by minimizing the error between images rendered from the UDF and the multi-view ground truth. However, these differentiable renderers are mainly handcrafted, which makes them either biased on ray-surface intersections, or sensitive to unsigned distance outliers, or not scalable to large scenes. To resolve these issues, we present a novel differentiable renderer to infer UDFs more accurately. Instead of using handcrafted equations, our differentiable renderer is a neural network which is pre-trained in a data-driven manner. It learns how to render unsigned distances into depth images, yielding prior knowledge that we dub volume rendering priors. To infer a UDF for an unseen scene from multiple RGB images, we generalize the learned volume rendering priors to map inferred unsigned distances to the opacities used in alpha blending for RGB image rendering. To reduce the bias of sampling in UDF inference, we utilize an auxiliary point sampling prior as an indicator of ray-surface intersection, and propose novel schemes towards more accurate and uniform sampling near the zero-level sets. We also propose a new strategy that uses our pretrained volume rendering prior as a general surface refiner, which can be integrated with various Gaussian reconstruction methods to optimize the Gaussian distributions and refine geometric details. Our results show that the learned volume rendering prior is unbiased, robust, scalable, 3D aware, and, more importantly, easy to learn. Further experiments show that the volume rendering prior is also a general strategy to enhance other neural implicit representations such as signed distance functions and occupancy.
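
The rendering pipeline described in the abstract can be sketched roughly as follows: windows of unsigned distances along a ray are mapped to per-sample opacities, which are then alpha-blended into a depth value. This is a minimal sketch, not the authors' implementation. The function names, the window size, and `toy_prior` are all assumptions made for illustration; in particular, `toy_prior` is a handcrafted stand-in for the pretrained network, whose architecture and weights are not given here.

```python
import numpy as np


def sliding_windows(udf, size=5):
    """One window of `size` consecutive UDF samples per ray sample.

    The ray is edge-padded so every sample gets a full window (an assumption;
    `size` is expected to be odd).
    """
    half = size // 2
    padded = np.pad(udf, (half, half), mode="edge")
    return np.lib.stride_tricks.sliding_window_view(padded, size)


def toy_prior(windows, scale=0.05):
    """Hypothetical stand-in for the learned prior.

    Returns an opacity in (0, 1] per window, high where the window's centre
    distance is close to zero. A real prior would be a trained network that
    uses the whole window.
    """
    centre = windows[:, windows.shape[1] // 2]
    return np.exp(-centre / scale)


def render_depth(t, alpha):
    """Standard alpha blending of sample depths `t` with opacities `alpha`."""
    # Transmittance before sample i: product of (1 - alpha_j) for j < i.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha[:-1])])
    weights = trans * alpha
    depth = np.sum(weights * t) / max(weights.sum(), 1e-8)
    return depth, weights


# A single ray hitting a surface at depth 1.0.
t = np.linspace(0.0, 2.0, 201)
udf = np.abs(t - 1.0)
alpha = toy_prior(sliding_windows(udf))
depth, weights = render_depth(t, alpha)
```

In this toy setup the rendered depth lands slightly in front of the true surface, because the handcrafted opacity already absorbs part of the ray before the zero-level set. That kind of bias is what the abstract attributes to handcrafted renderers and aims to remove by learning the mapping from data.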

Paper Structure (37 sections, 8 equations, 25 figures, 9 tables)

Figures (25)

  • Figure 1: We highlight our multi-view reconstruction results from UDFs learned on real-captured open surface scenes and indoor scenes. The two sides of a surface are colored in white and beige, respectively. Compared with NeuS (Wang et al., 2021) and the state-of-the-art UDF reconstruction method NeUDF (Liu et al., 2023), our method does not produce artifacts and recovers more accurate and smoother geometries on both open and closed surfaces.
  • Figure 2: Statistics of depth L1-error for various differentiable renderers. Each data point represents the mean depth L1-error computed between 100 predicted and GT depth maps of a random object from each category of ShapeNet.
  • Figure 3: Comparisons of estimated depth images and depth error maps among different differentiable renderers on one shape from the category of "tower" in ShapeNet.
  • Figure 4: Overview of our method. In the training phase, our volume rendering prior takes sliding windows of GT UDFs from training meshes as input, and outputs opaque densities for alpha blending. The parameters are optimized by the error between rendered depth and ground truth depth maps. During the testing phase, we freeze the volume rendering prior and use ground truth multi-view RGB images to optimize a randomly initialized UDF field.
  • Figure 5: Comparisons of different differentiable renderer structures. Existing methods use handcrafted equations to convert UDFs to opaque density. We extend the single-resolution MLP of VRPrior to multi-resolution MLPs, which further enhance the robustness and 3D awareness of volume rendering priors within a local neighborhood.
  • ...and 20 more figures
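
The depth L1-error statistic reported in Figure 2 can be computed along the following lines. This is a minimal sketch under stated assumptions: the function name `mean_depth_l1` and the optional `valid` mask are hypothetical, and how the paper treats background pixels with no ground-truth depth is not stated here, so the default simply ignores pixels whose ground-truth depth is not finite.

```python
import numpy as np


def mean_depth_l1(pred, gt, valid=None):
    """Mean over depth maps of the per-map mean absolute depth error.

    pred, gt: arrays of shape (N, H, W) holding N depth maps.
    valid:    optional boolean mask of the same shape (assumption: pixels
              without ground-truth depth are excluded).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if valid is None:
        valid = np.isfinite(gt)
    per_map = [np.abs(p[m] - g[m]).mean() for p, g, m in zip(pred, gt, valid)]
    return float(np.mean(per_map))


# Example: 100 synthetic depth maps with a constant 0.5 offset.
rng = np.random.default_rng(0)
gt_maps = rng.uniform(1.0, 3.0, size=(100, 8, 8))
pred_maps = gt_maps + 0.5
error = mean_depth_l1(pred_maps, gt_maps)
```

Averaging per map first, then across maps, gives every view equal weight regardless of how many valid pixels it contains.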