AugGS: Self-augmented Gaussians with Structural Masks for Sparse-view 3D Reconstruction

Bi'an Du; Lingbei Meng; Wei Hu

TL;DR

This work tackles sparse-view 3D reconstruction with a self-augmented, two-stage Gaussian splatting framework. A coarse-to-fine Gaussian model is combined with perceptual data augmentation from a fine-tuned 2D diffusion prior, and structure-aware masks help preserve geometry under sparse observations. The approach is reported to achieve state-of-the-art perceptual quality and multi-view consistency on benchmarks such as MipNeRF360, OmniObject3D, and OpenIllumination, while also improving training and inference efficiency. In practice, this enables high-fidelity 3D reconstruction from only a few views at reduced computational cost.

Abstract

Sparse-view 3D reconstruction is a major challenge in computer vision, aiming to create complete three-dimensional models from limited viewing angles. Key obstacles include: 1) a small number of input images with inconsistent information; 2) dependence on input image quality; and 3) large model parameter sizes. To tackle these issues, we propose a self-augmented two-stage Gaussian splatting framework enhanced with structural masks for sparse-view 3D reconstruction. First, our method builds a coarse 3D Gaussian representation from the sparse inputs and renders multi-view images. We then fine-tune a pre-trained 2D diffusion model to enhance these renderings and use the enhanced images as augmented data to further optimize the 3D Gaussians. Additionally, a structural masking strategy applied during training improves the model's robustness to sparse inputs and noise. Experiments on benchmarks such as MipNeRF360, OmniObject3D, and OpenIllumination demonstrate that our approach achieves state-of-the-art performance in perceptual quality and multi-view consistency with sparse inputs.
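
The abstract names a structural masking strategy but does not spell out how the masks are built. The sketch below is only an illustration of what a point-level mask (for a coarse stage) and a patch-level mask (for a fine stage) could look like. The function names `point_mask` and `patch_mask`, the `drop_ratio` parameter, the uniform random selection, and the square patch size are all assumptions made for this example, not the authors' implementation.

```python
import random


def point_mask(num_points, drop_ratio, rng):
    """Hypothetical point-level mask: True keeps a point, False drops it.

    Assumption: points are dropped uniformly at random.
    """
    if not 0.0 <= drop_ratio < 1.0:
        raise ValueError("drop_ratio must be in [0, 1)")
    num_drop = int(num_points * drop_ratio)
    dropped = set(rng.sample(range(num_points), num_drop))
    return [i not in dropped for i in range(num_points)]


def patch_mask(height, width, patch, drop_ratio, rng):
    """Hypothetical patch-level mask: a height x width grid of 0/1 values.

    Whole patch x patch blocks are zeroed together, so the mask hides
    contiguous image regions rather than isolated pixels.
    """
    if not 0.0 <= drop_ratio < 1.0:
        raise ValueError("drop_ratio must be in [0, 1)")
    rows = (height + patch - 1) // patch  # number of patch rows (ceil)
    cols = (width + patch - 1) // patch   # number of patch columns (ceil)
    num_drop = int(rows * cols * drop_ratio)
    dropped = set(rng.sample(range(rows * cols), num_drop))
    return [
        [0 if (y // patch) * cols + (x // patch) in dropped else 1
         for x in range(width)]
        for y in range(height)
    ]
```

Under these assumptions, `point_mask(10, 0.5, random.Random(0))` keeps five of ten points, and `patch_mask(8, 8, 4, 0.5, random.Random(0))` zeroes two of the four 4x4 patches. Passing an explicit `random.Random` instance keeps the masks reproducible across runs.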

Paper Structure (17 sections, 4 equations, 6 figures, 3 tables)

Introduction
RELATED WORKS
Differentiable Point-based Rendering
Neural Rendering for Sparse View Reconstruction
METHOD
Overview
Two-stage Training Gaussians
Perceptual View data Augmentation
Integration of Structure-aware Masks in Coarse-to-Fine Gaussian Process
Point-based Masks For the First Stage
Patch-based Masks For the Second Stage
EXPERIMENT
Implementation Detail
Dataset
Evaluation
...and 2 more sections

Figures (6)

Figure 1: Our method enables high-quality 3D reconstruction of sparse-view scenes with self-augmented Gaussian splatting, surpassing current SOTA methods for sparse-view 3D reconstruction both qualitatively and quantitatively.
Figure 2: The overall architecture of our self-augmented Gaussian splatting method. We first create a coarse 3D Gaussian model from sparse-view images, generating a coarse point cloud and renderings from novel views. Multi-view renders and the 2D prior enhance perceptual quality, with structural masks integrated into the two-stage Gaussian process.
Figure 3: Qualitative examples on the MipNeRF360 and OmniObject3D datasets with 4 input views.
Figure 4: Comparative analysis of PSNR for the 4-view and 9-view configurations across different objects and Gaussian iteration processes. Note: 'K', 'G', and 'B' represent the objects Kitchen, Garden, and Bonsai, respectively. 'C' refers to the coarse Gaussian iteration process, while 'F' denotes the fine Gaussian iteration process.
Figure 5: Ablation study on different augmentation strategies. "Aug" denotes augmentation, "PV" denotes perceptual view augmentation, and "M" denotes mask augmentation.
...and 1 more figure
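
Figure 4 reports PSNR, so a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE), may help when reading those curves. The function name `psnr`, the flattened pixel lists, and the default `max_value=1.0` (pixels scaled to [0, 1]) are assumptions for this example. The paper's exact evaluation protocol is not described in this section.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flattened images.

    Assumption: both inputs are equal-length sequences of pixel values
    in the range [0, max_value].
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the rendering is closer to the reference image. For example, a uniform error of 0.1 on [0, 1] pixels gives an MSE of 0.01 and therefore about 20 dB.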
