High-Fidelity Mask-free Neural Surface Reconstruction for Virtual Reality

Haotian Bai, Yize Chen, Lin Wang

TL;DR

Hi-NeuS is a novel rendering-based framework for neural implicit surface reconstruction that aims to recover compact and precise surfaces without multi-view object masks. It has been validated on NeuS and its variant Neuralangelo, demonstrating its adaptability across different NeuS backbones.

Abstract

Object-centric surface reconstruction from multi-view images is crucial in creating editable digital assets for AR/VR. Due to the lack of geometric constraints, existing methods, e.g., NeuS, require annotated object masks to reconstruct compact surfaces during mesh processing. Mask annotation, however, is cumbersome and incurs considerable labor costs. This paper presents Hi-NeuS, a novel rendering-based framework for neural implicit surface reconstruction, aiming to recover compact and precise surfaces without multi-view object masks. Our key insight is that the overlapping regions in the object-centric views naturally highlight the object of interest as the camera orbits around it. The object of interest can be specified by estimating the distribution of the rendering weights accumulated from multiple views, which implicitly identifies the surface that a user intends to capture. This inspires us to design a geometric refinement approach, which uses multi-view rendering weights to guide the signed distance function (SDF) of the neural surface in a self-supervised manner. Specifically, it retains these weights and resamples a pseudo surface based on their distribution. This facilitates the alignment of the SDF to the object of interest. We then regularize the SDF's bias for geometric consistency. Moreover, we propose to use the unmasked Chamfer Distance (CD) to measure the extracted mesh without post-processing for more precise evaluation. Our approach has been validated on NeuS and its variant Neuralangelo, demonstrating its adaptability across different NeuS backbones. Extensive benchmarking on the DTU dataset shows that our method reduces surface noise by about 20% and improves the unmasked CD by around 30%, achieving better surface details. The superiority of Hi-NeuS is further validated on BlendedMVS and handheld camera captures for content creation.
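
The weight accumulation and resampling described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes the discrete opacity formula from the NeuS paper for a single ray, a hypothetical unit-sphere SDF as the scene, an arbitrary sharpness value `s = 64`, and plain weight-proportional sampling as the resampling step. All function and variable names are illustrative.

```python
import math
import random

def sigmoid(x):
    # Numerically stable logistic function (the CDF Phi_s in NeuS, up to scaling).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def neus_weights(sdf_values, s=64.0):
    """Rendering weights w_i = T_i * alpha_i for one ray.

    sdf_values are SDF samples at ordered points along the ray;
    one weight is returned per interval between consecutive samples.
    """
    weights = []
    transmittance = 1.0
    for f_cur, f_next in zip(sdf_values[:-1], sdf_values[1:]):
        cdf_cur = sigmoid(s * f_cur)
        cdf_next = sigmoid(s * f_next)
        # NeuS discrete opacity, clamped at zero where the SDF increases.
        alpha = max((cdf_cur - cdf_next) / (cdf_cur + 1e-8), 0.0)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights

def resample_by_weight(points, weights, n, rng):
    # Assumed resampling rule: draw points with probability proportional to weight.
    return rng.choices(points, weights=weights, k=n)

# Toy ray travelling along +z toward a unit sphere at the origin (hypothetical scene).
origin_z = -2.0
depths = [0.05 * i for i in range(81)]  # depths 0.0 .. 4.0
sdf = [abs(origin_z + t) - 1.0 for t in depths]  # sphere SDF restricted to the z-axis

weights = neus_weights(sdf)

# Attribute each interval's weight to the interval midpoint.
midpoints = [(0.0, 0.0, origin_z + 0.5 * (t0 + t1))
             for t0, t1 in zip(depths[:-1], depths[1:])]

pseudo_surface = resample_by_weight(midpoints, weights, n=100, rng=random.Random(0))
mean_z = sum(w * p[2] for w, p in zip(weights, midpoints)) / sum(weights)
print(f"weighted mean depth along z: {mean_z:.3f}")
```

In this toy setting the weights concentrate around the first zero crossing of the SDF (the front of the sphere at z = -1), so the resampled points cluster there. In a multi-view setting the same draw would be made over weights accumulated from many rays.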

Paper Structure (23 sections, 11 equations, 16 figures, 3 tables, 1 algorithm)

Figures (16)

  • Figure 1: (a) Comparison on surface reconstruction without masks. We aim to reduce the noise in surface reconstruction without relying on multi-view object masks. Compared to existing methods, including NeuS (Wang et al., 2021), Neuralangelo, and Gaussian Surfels (Dai et al., 2024), Hi-NeuS produces compact and more precise mesh results, enhancing its utility for downstream applications in virtual reality. (b) Self-supervised geometry refinement. The rendering weights from multiple views are accumulated, corresponding to resampled target surface points $x_t$ (red). Based on this supervision, we advance query points $x_i$ (blue) to obtain the predicted surface points $x_q$ (green). We then align them using Chamfer Distance (CD) with global geometric constraints related to SDF.
  • Figure 2: Our proposed Hi-NeuS training framework: In volume rendering combined with geometry learning, we capture rendering weights from multiple views. Hi-NeuS then resamples based on the weight distribution to obtain supervisory surface points. Finally, global geometric refinement is applied using geometric constraints.
  • Figure 3: Qualitative comparison of Hi-NeuS: (a) Adaptability. We integrate our geometry refinement with NeuS and Neuralangelo to prove its adaptability. The magnified boxes reveal the recovered details. (b) Compactness. We compare our Hi-NeuS with the NeuS backbone against existing methods. The corner box of each image displays the highlighted areas outside the visual hull, denoted as mesh noise.
  • Figure 4: The mesh post-processing and its evaluation: Mesh noise refers to the space ratio outside the 3D visual hull created by silhouettes. The dashed circles highlight the areas where space is missing and must be evaluated. To evaluate this, we calculate the CD between sampled point clouds and GT point clouds. We compare the range of space to be evaluated between our proposed unmasked CD (red arrows) and the masked CD used in previous methods (Neuralangelo; Wang et al., 2021; Oechsle et al., 2021; Yariv et al., 2020) (blue arrows).
  • Figure 5: Ablation study on proposed losses: performance evaluation on scan 40 in the DTU dataset and the average results across the DTU dataset, with their performance comparisons relative to NeuS. The boxes emphasize the difference in mesh quality.
  • ...and 11 more figures
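
The refinement step in the Figure 1(b) caption (advance query points $x_i$ to predicted surface points $x_q$, then align them with target points $x_t$ using CD) can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's method: the pull rule $x_q = x_i - f(x_i)\,\nabla f(x_i)/\lVert\nabla f(x_i)\rVert$ is an assumed form of "advancing" a point along the SDF gradient, the analytic unit-sphere SDF stands in for a learned network, the target points are sampled directly on the sphere instead of being resampled from rendering weights, and the CD shown is one common convention (sum of the two mean nearest-neighbour distances, unsquared). All names are hypothetical.

```python
import math
import random

def sphere_sdf(p, radius=1.0):
    # Analytic SDF of a sphere at the origin; stands in for a learned SDF network.
    return math.sqrt(sum(c * c for c in p)) - radius

def sphere_sdf_grad(p):
    # Gradient of the sphere SDF: the unit vector pointing away from the origin.
    n = math.sqrt(sum(c * c for c in p))
    return tuple(c / n for c in p)

def pull_to_surface(p, sdf, grad):
    # Assumed pull rule: x_q = x_i - f(x_i) * grad f(x_i) / ||grad f(x_i)||.
    d = sdf(p)
    g = grad(p)
    g_norm = math.sqrt(sum(c * c for c in g))
    return tuple(c - d * gc / g_norm for c, gc in zip(p, g))

def chamfer_distance(a, b):
    # Symmetric CD by brute force: mean nearest-neighbour distance a->b plus b->a.
    def one_way(src, dst):
        return sum(min(math.dist(s, t) for t in dst) for s in src) / len(src)
    return one_way(a, b) + one_way(b, a)

def random_unit_vector(rng):
    # Normalised Gaussian vector: uniform direction on the sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

rng = random.Random(0)

# Target surface points x_t (here: sampled on the true surface).
x_t = [random_unit_vector(rng) for _ in range(200)]

# Query points x_i scattered around the surface at radii in [0.5, 1.5].
x_i = []
for _ in range(200):
    u = random_unit_vector(rng)
    r = rng.uniform(0.5, 1.5)
    x_i.append(tuple(r * c for c in u))

# Predicted surface points x_q obtained by pulling each query point along the gradient.
x_q = [pull_to_surface(p, sphere_sdf, sphere_sdf_grad) for p in x_i]

cd_before = chamfer_distance(x_i, x_t)
cd_after = chamfer_distance(x_q, x_t)
print(f"CD(x_i, x_t) = {cd_before:.4f}, CD(x_q, x_t) = {cd_after:.4f}")
```

With an exact SDF the pulled points land on the surface, so the CD to the targets drops. During training, a CD term of this kind would instead act as a loss that drives the learned SDF toward the target points.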