FatesGS: Fast and Accurate Sparse-View Surface Reconstruction using Gaussian Splatting with Depth-Feature Consistency

Han Huang, Yulun Wu, Chao Deng, Ge Gao, Ming Gu, Yu-Shen Liu

TL;DR

FatesGS tackles sparse-view surface reconstruction by extending Gaussian Splatting with two core ideas: intra-view depth consistency, enforced through patch-based monocular depth ranking and depth smoothing, and multi-view feature alignment, which enforces cross-view coherence of the surface points recovered from rendered depth. It converts 3D Gaussians to 2D ellipses on a local tangent plane, renders them with a splatting pipeline, and optimizes depth and color through learnable parameters. The method achieves state-of-the-art results on DTU and BlendedMVS while running 60x–200x faster than prior state-of-the-art methods, and it needs neither long per-scene optimization nor large-scale pre-training. Overall, FatesGS offers a practical, efficient route to fast, fine-grained mesh reconstruction from sparse views, with robust geometric accuracy and high rendering fidelity.

Abstract

Recently, Gaussian Splatting has sparked a new trend in the field of computer vision. Apart from novel view synthesis, it has also been extended to the area of multi-view reconstruction. The latest methods facilitate complete, detailed surface reconstruction while ensuring fast training speed. However, these methods still require dense input views, and their output quality significantly degrades with sparse views. We observed that the Gaussian primitives tend to overfit the few training views, leading to noisy floaters and incomplete reconstruction surfaces. In this paper, we present an innovative sparse-view reconstruction framework that leverages intra-view depth and multi-view feature consistency to achieve remarkably accurate surface reconstruction. Specifically, we utilize monocular depth ranking information to supervise the consistency of depth distribution within patches and employ a smoothness loss to enhance the continuity of the distribution. To achieve finer surface reconstruction, we optimize the absolute position of depth through multi-view projection features. Extensive experiments on DTU and BlendedMVS demonstrate that our method outperforms state-of-the-art methods with a speedup of 60x to 200x, achieving swift and fine-grained mesh reconstruction without the need for costly pre-training.
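
The two intra-view terms mentioned in the abstract (depth ranking within patches and a smoothness loss) can be illustrated with a minimal sketch. This is not the authors' formulation: the hinge-style pairwise loss, the first-order smoothness term, and the names `patch_ranking_loss`, `depth_smoothness`, `num_pairs`, `margin`, and `tol` are all assumptions chosen for illustration. The sketch also assumes the monocular prior uses the same ordering convention as the rendered depth (larger value means farther).

```python
import random


def patch_ranking_loss(rendered, mono, num_pairs=64, margin=1e-4, tol=0.0, seed=0):
    """Hinge-style pairwise ranking loss over one flattened patch (illustrative).

    rendered, mono: equal-length lists holding the rendered depth and the
    monocular depth prior for the same pixels of a patch.
    """
    assert len(rendered) == len(mono) and len(mono) >= 2
    rng = random.Random(seed)
    total, used = 0.0, 0
    for _ in range(num_pairs):
        # Sample two distinct pixels from the patch.
        i, j = rng.sample(range(len(mono)), 2)
        diff = mono[i] - mono[j]
        if abs(diff) <= tol:
            continue  # the prior gives no reliable ordering for this pair
        sign = 1.0 if diff > 0 else -1.0
        # Penalize the pair only when the rendered ordering disagrees with
        # the prior (or agrees by less than the margin).
        total += max(0.0, margin - sign * (rendered[i] - rendered[j]))
        used += 1
    return total / used if used else 0.0


def depth_smoothness(depth):
    """Mean absolute difference between adjacent pixels of a 2D depth patch."""
    h, w = len(depth), len(depth[0])
    diffs = [abs(depth[y][x + 1] - depth[y][x]) for y in range(h) for x in range(w - 1)]
    diffs += [abs(depth[y + 1][x] - depth[y][x]) for y in range(h - 1) for x in range(w)]
    return sum(diffs) / len(diffs) if diffs else 0.0
```

Because only the ordering of the monocular depth is used, the ranking term is insensitive to the unknown scale and shift of the prior; the absolute depth is left to other supervision, which the abstract attributes to multi-view projection features.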

Paper Structure

This paper contains 24 sections, 18 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Surface reconstruction from 3-view images of DTU scan 24. The popular general-purpose method 2DGS [2dgs] is fast but yields coarse results. The state-of-the-art per-scene optimization method, NeuSurf [huang2024neusurf], and the generalizable method, UFORecon [na2024uforecon], produce suboptimal surfaces and require long training times. In contrast, our method (FatesGS) achieves swift and detailed reconstruction. *Pre-training time.
  • Figure 2: Overview of FatesGS. Starting with a set of sparse input views, we initialize 2D Gaussians using COLMAP and employ splatting to render RGB images and depth maps. To enhance the geometric learning process, we integrate ranking information from monocular depth estimation and apply depth smoothing to ensure intra-view depth consistency. To further refine the geometry, we align the multi-view features extracted by projecting estimated surface points onto the source images.
  • Figure 3: Visual comparison of 3-view reconstruction on the BlendedMVS dataset.
  • Figure 4: Qualitative comparison of reconstruction results on the DTU dataset under different sparse-view settings.
  • Figure 5: Visual comparison of the ablation study on DTU scan 83. In the error maps, the transition from blue to yellow indicates increasing reconstruction error.
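
The feature-alignment step described in the Figure 2 caption (projecting estimated surface points onto the source images and comparing features) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes a pinhole camera with intrinsics given as `(fx, fy, cx, cy)`, a rigid reference-to-source transform `(R, t)`, and cosine distance as the feature discrepancy. All function names here are hypothetical.

```python
import math


def backproject(u, v, depth, K):
    """Lift pixel (u, v) with rendered depth to a 3D point in the reference camera frame."""
    fx, fy, cx, cy = K
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def transform(p, R, t):
    """Apply the rigid transform p' = R p + t (reference frame -> source frame)."""
    return tuple(sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))


def project(p, K):
    """Project a 3D point in a camera frame to pixel coordinates (assumes z > 0)."""
    fx, fy, cx, cy = K
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)


def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def reproject(u, v, depth, K_ref, K_src, R, t):
    """Pixel in the reference view -> corresponding pixel in a source view."""
    return project(transform(backproject(u, v, depth, K_ref), R, t), K_src)
```

In such a scheme, the feature sampled at `(u, v)` in the reference view would be compared with the feature sampled at the reprojected location in the source view. If the rendered depth is wrong, the reprojected pixel lands on a different surface point and the distance grows, which gives the optimizer a signal about absolute depth that ranking alone cannot provide.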