AugGS: Self-augmented Gaussians with Structural Masks for Sparse-view 3D Reconstruction
Bi'an Du, Lingbei Meng, Wei Hu
TL;DR
This work tackles sparse-view 3D reconstruction with a self-augmented, two-stage Gaussian splatting framework. It combines a coarse-to-fine Gaussian model with perceptual data augmentation from a fine-tuned 2D diffusion prior, and uses structure-aware masks to preserve geometry under sparse observations. The approach achieves state-of-the-art perceptual quality and multi-view consistency on benchmarks such as MipNeRF360, OmniObject3D, and OpenIllumination, while also improving training and inference efficiency. In practice, it enables high-fidelity 3D reconstruction from few views at reduced computational cost.
Abstract
Sparse-view 3D reconstruction is a major challenge in computer vision: the goal is to recover a complete three-dimensional model from a limited set of viewing angles. Key obstacles include: 1) a small number of input images with inconsistent information; 2) dependence on input image quality; and 3) large model parameter sizes. To tackle these issues, we propose a self-augmented two-stage Gaussian splatting framework enhanced with structural masks for sparse-view 3D reconstruction. In the first stage, our method fits a coarse 3D Gaussian representation to the sparse inputs and renders multi-view images from it. We then fine-tune a pre-trained 2D diffusion model to enhance these renders and use the enhanced images as augmented data to further optimize the 3D Gaussians. Additionally, a structural masking strategy applied during training improves the model's robustness to sparse inputs and noise. Experiments on benchmarks such as MipNeRF360, OmniObject3D, and OpenIllumination demonstrate that our approach achieves state-of-the-art performance in perceptual quality and multi-view consistency with sparse inputs.
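The two-stage flow described in the abstract can be sketched as plain control flow. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name self_augmented_fit and the callables fit, render, and enhance are hypothetical placeholders, the diffusion prior is assumed to be already fine-tuned, and the structural masking is assumed to live inside fit and is not modeled here.

```python
from typing import Any, Callable, List, Sequence, Tuple

# (camera pose, image); both element types are placeholders for illustration.
View = Tuple[Any, Any]


def self_augmented_fit(
    sparse_views: Sequence[View],
    novel_poses: Sequence[Any],
    fit: Callable[[Sequence[View], Any], Any],
    render: Callable[[Any, Any], Any],
    enhance: Callable[[Any], Any],
) -> Tuple[Any, List[View]]:
    """Coarse fit, render, enhance, then refine on real plus augmented views.

    All three callables are hypothetical stand-ins, not a real library API:
      fit(views, init_model) -> model   optimises Gaussians; init_model may be None
      render(model, pose) -> image      rasterises the model from a camera pose
      enhance(image) -> image           applies the (already fine-tuned) 2D prior
    """
    # Stage 1: coarse Gaussians from the sparse inputs only.
    coarse = fit(sparse_views, None)

    # Render the coarse model from extra poses and enhance each render.
    augmented = [(pose, enhance(render(coarse, pose))) for pose in novel_poses]

    # Stage 2: continue optimising from the coarse model on real + augmented views.
    refined = fit(list(sparse_views) + augmented, coarse)
    return refined, augmented
```

Passing the stages in as callables keeps the sketch independent of any particular splatting or diffusion library; how novel poses are chosen and how the prior is fine-tuned are left open because the abstract does not specify them.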
