FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting

Zehao Zhu, Zhiwen Fan, Yifan Jiang, Zhangyang Wang

TL;DR

FSGS tackles real-time novel view synthesis from extremely sparse observations by extending 3D Gaussian Splatting with Proximity-guided Gaussian Unpooling to densify the scene from sparse SfM points. It couples this geometric densification with pseudo-view augmentation and monocular depth priors, plus differentiable depth rasterization, to regularize and steer Gaussian optimization. The method achieves state-of-the-art performance on multiple few-shot benchmarks (LLFF, Mip-NeRF360, Blender, Shiny) while delivering real-time rendering speeds (>200 FPS) with as few as three training views. This combination of dense geometric coverage and fast rendering makes FSGS highly practical for real-world few-shot view synthesis tasks.

Abstract

Novel view synthesis from limited observations remains an important and persistent task. However, high efficiency in existing NeRF-based few-shot view synthesis is often compromised to obtain an accurate 3D representation. To address this challenge, we propose a few-shot view synthesis framework based on 3D Gaussian Splatting that enables real-time and photo-realistic view synthesis with as few as three training views. The proposed method, dubbed FSGS, handles the extremely sparse initialized SfM points with a thoughtfully designed Gaussian Unpooling process. Our method iteratively distributes new Gaussians around the most representative locations, subsequently infilling local details in vacant areas. We also integrate a large-scale pre-trained monocular depth estimator within the Gaussians optimization process, leveraging online augmented views to guide the geometric optimization towards an optimal solution. Starting from sparse points observed from limited input viewpoints, our FSGS can accurately grow into unseen regions, comprehensively covering the scene and boosting the rendering quality of novel views. Overall, FSGS achieves state-of-the-art performance in both accuracy and rendering efficiency across diverse datasets, including LLFF, Mip-NeRF360, and Blender. Project website: https://zehaozhu.github.io/FSGS/.
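The unpooling idea described in the abstract, growing new Gaussians around existing ones to fill vacant areas, can be illustrated with a small sketch. This is not the authors' implementation: the function name `unpool_gaussian_centers`, the choice of placing a new center at the midpoint between a Gaussian and each of its `k` nearest neighbors, and the `min_dist` threshold are all assumptions made for illustration, and the sketch handles only center positions (not scale, rotation, opacity, or color).

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def unpool_gaussian_centers(
    centers: List[Point], k: int = 3, min_dist: float = 0.0
) -> List[Point]:
    """Propose new centers between each center and its k nearest neighbors.

    Hypothetical helper: midpoint placement and the distance threshold are
    assumptions of this sketch, not details stated in the abstract.
    """
    new_centers = set()
    for i, c in enumerate(centers):
        # Brute-force neighbor search; acceptable for a sparse SfM point cloud.
        dists = sorted(
            (math.dist(c, other), j)
            for j, other in enumerate(centers)
            if j != i
        )
        for d, j in dists[:k]:
            if d <= min_dist:
                continue  # the pair is already close; no gap to fill
            mid = tuple((a + b) / 2.0 for a, b in zip(c, centers[j]))
            new_centers.add(mid)  # the set removes the duplicate (j, i) midpoint
    return sorted(new_centers)


if __name__ == "__main__":
    sparse = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    print(unpool_gaussian_centers(sparse, k=2))
```

Repeating such a step during optimization would progressively densify the cloud; how often to repeat it and how the remaining Gaussian attributes are initialized are left open here.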

Paper Structure (36 sections, 8 equations, 10 figures, 7 tables, 1 algorithm)


Figures (10)

  • Figure 1: Real-Time Few-shot Novel View Synthesis. We present a point-based framework that is initialized from extremely sparse SfM points, achieving a significantly faster rendering speed (2900$\times$) while enhancing the visual quality (from 0.684 to 0.745, in SSIM) compared to the previous SparseNeRF wang2023sparsenerf.
  • Figure 2: FSGS Pipeline. 3D Gaussians are initialized from COLMAP, with a few images (black cameras). For the sparsely placed Gaussians, we propose densifying new Gaussians to enhance scene coverage by unpooling existing Gaussians into new ones, with properly initialized Gaussian attributes. Monocular depth priors, enhanced by sampling unobserved views (red cameras), guide the optimization of grown Gaussians towards a reasonable geometry. The final loss consists of a photometric loss term, and a geometric regularization term calculated as depth relative correspondence.
  • Figure 3: Points Sparsity vs. Synthesized Quality. The SfM points from COLMAP using 3 views (Bottom Left) are significantly sparser than those from the full set of views (Top Left). The quality of 3D-GS initialized from sparse SfM points degrades as the number of training views decreases.
  • Figure 4: Gaussian Unpooling Illustration. We show a 2D toy case for visualizing Gaussian Unpooling with depth guidance, where the example 1D depth provides priors on the relative distance of the Gaussians from the viewing direction, guiding the Gaussian deformation toward a better solution.
  • Figure 5: Qualitative Results on LLFF Datasets. We demonstrate novel view results produced by 3D-GS kerbl20233d, Mip-NeRF360 Barron2022MipNeRF3U, SparseNeRF wang2023sparsenerf and our approach for comparison. We can observe that NeRF-based methods generate floaters (Scene: Flower) and show aliasing results (Scene: Leaves) due to limited observation. 3D-GS produces oversmoothed results, caused by overfitting on training views. Our method produces pleasing appearances while demonstrating detailed thin structures.
  • ...and 5 more figures
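
The Figure 2 caption mentions a geometric regularization term computed as a "depth relative correspondence". Because a monocular depth estimate is generally reliable only up to an unknown scale and shift, one natural way to compare it with rendered depth is a correlation-based loss. The sketch below assumes Pearson correlation for that purpose; the function name `pearson_depth_loss` and the handling of constant inputs are illustrative choices, not details given in this section.

```python
import math
from typing import Sequence


def pearson_depth_loss(rendered: Sequence[float], estimated: Sequence[float]) -> float:
    """Return 1 - Pearson correlation between two flattened depth maps.

    Assumed formulation: 0 means the depths agree up to a positive scale
    and shift, 2 means they are perfectly anti-correlated.
    """
    n = len(rendered)
    if n != len(estimated) or n < 2:
        raise ValueError("depth maps must have equal length of at least 2")
    mean_r = sum(rendered) / n
    mean_e = sum(estimated) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(rendered, estimated))
    var_r = sum((r - mean_r) ** 2 for r in rendered)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    denom = math.sqrt(var_r * var_e)
    if denom == 0.0:
        # A constant depth map carries no relative ordering; treat it as uncorrelated.
        return 1.0
    return 1.0 - cov / denom


if __name__ == "__main__":
    rendered_depth = [1.0, 2.0, 3.0, 4.0]
    mono_depth = [3.0, 5.0, 7.0, 9.0]  # same ordering, different scale and shift
    print(pearson_depth_loss(rendered_depth, mono_depth))
```

In a training setting the same quantity would be computed on differentiable tensors from the depth rasterizer, for both training views and sampled pseudo-views; the pure-Python version here only shows the arithmetic.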