
Spurfies: Sparse Surface Reconstruction using Local Geometry Priors

Kevin Raj, Christopher Wewer, Raza Yunus, Eddy Ilg, Jan Eric Lenssen

TL;DR

Spurfies tackles sparse-view surface reconstruction by learning a local geometry prior from a small subset of the synthetic ShapeNet dataset and coupling it with a neural point representation that disentangles geometry and appearance. The geometry prior is trained with SDF, Eikonal, and Total Variation losses and is kept frozen during inference, where it regularizes the per-scene optimization of appearance and surface geometry via differentiable volume rendering. A neural point cloud with local processing regresses SDF and radiance from per-point latent codes, which enables reconstruction from only a few views and extends to large, unbounded scenes such as those in Mip-NeRF 360. The method improves Chamfer Distance on DTU by 35% over prior sparse-view methods and delivers competitive novel-view synthesis, demonstrating the value of decoupling the geometry prior from appearance for data-efficient 3D reconstruction.
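As a rough illustration of the Eikonal term named above, the following sketch penalizes deviations of the SDF gradient norm from 1. It is not the authors' implementation: the analytic sphere SDF, the finite-difference gradients, and the names `sphere_sdf` and `eikonal_loss` are assumptions chosen to keep the example self-contained (a learned prior would typically use automatic differentiation instead).

```python
import numpy as np

def sphere_sdf(x, radius=0.5):
    """Analytic signed distance to a sphere (hypothetical stand-in for a learned SDF)."""
    return np.linalg.norm(x, axis=-1) - radius

def eikonal_loss(sdf_fn, points, eps=1e-4):
    """Mean squared deviation of the SDF gradient norm from 1.

    Gradients are approximated with central finite differences,
    an assumption made here to avoid an autodiff dependency.
    """
    grads = np.zeros_like(points)
    for axis in range(points.shape[-1]):
        offset = np.zeros(points.shape[-1])
        offset[axis] = eps
        grads[:, axis] = (sdf_fn(points + offset) - sdf_fn(points - offset)) / (2 * eps)
    grad_norm = np.linalg.norm(grads, axis=-1)
    return float(np.mean((grad_norm - 1.0) ** 2))

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(1024, 3))

# A true distance field satisfies the Eikonal equation, so the loss is near zero.
loss_true = eikonal_loss(sphere_sdf, points)
# Scaling the field by 2 doubles the gradient norm, so the loss is close to 1.
loss_scaled = eikonal_loss(lambda x: 2.0 * sphere_sdf(x), points)
```

The comparison between `loss_true` and `loss_scaled` shows why the term acts as a regularizer: it is minimized only by fields whose values behave like actual distances.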

Abstract

We introduce Spurfies, a novel method for sparse-view surface reconstruction that disentangles appearance and geometry information to utilize local geometry priors trained on synthetic data. Recent research heavily focuses on 3D reconstruction using dense multi-view setups, typically requiring hundreds of images. However, these methods often struggle with few-view scenarios. Existing sparse-view reconstruction techniques often rely on multi-view stereo networks that need to learn joint priors for geometry and appearance from a large amount of data. In contrast, we introduce a neural point representation that disentangles geometry and appearance to train a local geometry prior using only a subset of the synthetic ShapeNet dataset. During inference, we utilize this surface prior as an additional constraint for surface and appearance reconstruction from sparse input views via differentiable volume rendering, restricting the space of possible solutions. We validate the effectiveness of our method on the DTU dataset and demonstrate that it outperforms the previous state of the art by 35% in surface quality while achieving competitive novel view synthesis quality. Moreover, in contrast to previous works, our method can be applied to larger, unbounded scenes, such as those in Mip-NeRF 360.
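To make "differentiable volume rendering" of a surface concrete, here is a minimal sketch of how SDF samples along a ray can be turned into rendering weights. The abstract does not specify the SDF-to-density conversion, so the Laplace-CDF density below (in the style of VolSDF) is an assumption, as are the function names and the value of `beta`.

```python
import numpy as np

def laplace_cdf_density(sdf, beta=0.05):
    """VolSDF-style density: (1 / beta) * LaplaceCDF(-sdf). Assumed, not from the source."""
    s = -sdf
    half = 0.5 * np.exp(-np.abs(s) / beta)
    psi = np.where(s <= 0, half, 1.0 - half)
    return psi / beta

def render_weights(density, t):
    """Standard volume rendering quadrature weights along one ray."""
    # Interval lengths between samples; the last interval repeats the previous one.
    deltas = np.diff(t, append=t[-1] + (t[-1] - t[-2]))
    tau = density * deltas
    # Transmittance before each sample: exp(-sum of optical depth of earlier samples).
    transmittance = np.exp(-np.concatenate([[0.0], np.cumsum(tau)[:-1]]))
    return transmittance * (1.0 - np.exp(-tau))

# Toy ray that hits a surface at depth t = 1 (SDF is positive outside, negative inside).
t = np.linspace(0.0, 2.0, 128)
sdf = 1.0 - t
weights = render_weights(laplace_cdf_density(sdf), t)
expected_depth = float(np.sum(weights * t) / np.sum(weights))
peak_depth = float(t[np.argmax(weights)])
```

Because the weights concentrate around the zero level set, a photometric loss on the rendered color produces gradients for the SDF near the surface, which is what lets a geometry prior and image supervision act on the same field.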

Paper Structure (50 sections, 16 equations, 13 figures, 3 tables)


Figures (13)

  • Figure 1: Qualitative comparison of mesh reconstruction with point-based mesh reconstruction methods. In contrast to our approach, point-based mesh reconstruction methods often show missing areas, even when initialized with DUSt3R point clouds.
  • Figure 2: Method overview: 1) Preprocess: Given a sparse set of input views, we make use of DUSt3R to predict points $\mathcal{P}$. Representation: The points serve as the basis for a neural point representation that stores disentangled features $\mathbf{f}^a$, $\mathbf{f}^g$ for appearance and geometry on each point. 2) Local Prior (top): We learn a local geometry prior $G_\textnormal{LP}$ and $G_\textnormal{REG}$ over a subset of shapes from the synthetic ShapeNet dataset by optimizing to predict the ground-truth SDF. 3) Spurfies (bottom): We make use of the prior for surface reconstruction from sparse images, optimizing only the latent codes $\mathbf{f}^a, \mathbf{f}^g$ and the color MLPs $A_\textnormal{LP}$ and $A_\textnormal{REG}$ to reconstruct images via volume rendering.
  • Figure 2: Sampled points from the reconstructed mesh on a few scans from the DTU dataset.
  • Figure 3: Qualitative mesh reconstruction comparison on DTU. Compared to previous state-of-the-art sparse-view methods, our reconstruction demonstrates superior completeness in regions with less view overlap. Our closest competitor is NeuSurf, which also reconstructs high-quality surfaces on the object-centric DTU scenes. However, it fails to generalize to larger scenes (cf. the qualitative mesh reconstruction figure on Mip-NeRF 360).
  • Figure 3: Qualitative mesh reconstruction on Mip-NeRF 360. Compared to previous sparse-view methods, we achieve much better reconstruction on larger, unbounded scenes. S-VolSDF completely failed on the stump scene.
  • ...and 8 more figures