LoopSparseGS: Loop Based Sparse-View Friendly Gaussian Splatting

Zhenyu Bao, Guibiao Liao, Kaichen Zhou, Kanglin Liu, Qing Li, Guoping Qiu

TL;DR

LoopSparseGS addresses sparse-input novel view synthesis for 3D Gaussian Splatting by introducing a loop-based Progressive Gaussian Initialization that densifies initial point clouds using pseudo-views, a Depth-alignment Regularization that fuses sparse SfM depth with dense monocular depth via a sliding-window loss, and a Sparse-friendly Sampling strategy that splits oversized Gaussians guided by pixel error. These components collectively provide denser geometry, more reliable depth supervision, and improved handling of large Gaussians, yielding state-of-the-art performance on four datasets (indoor, outdoor, and object-level) across multiple resolutions. The method demonstrates robust improvements over existing approaches in PSNR, SSIM, and perceptual metrics, while maintaining efficient rendering speeds. This work advances sparse-view NVS by integrating loop-based initialization, depth alignment, and adaptive Gaussian densification to produce photorealistic results with limited input views, enabling more practical deployment in real-world scenarios.

Abstract

Despite the photorealistic novel view synthesis (NVS) performance achieved by the original 3D Gaussian Splatting (3DGS), its rendering quality degrades significantly with sparse input views. This performance drop is mainly caused by the limited number of initial points generated from the sparse input, insufficient supervision during training, and inadequate regularization of oversized Gaussian ellipsoids. To handle these issues, we propose LoopSparseGS, a loop-based 3DGS framework for the sparse novel view synthesis task. Specifically, we propose a loop-based Progressive Gaussian Initialization (PGI) strategy that iteratively densifies the initialized point cloud using rendered pseudo images during training. Then, the sparse and reliable depth from Structure from Motion (SfM) and the window-based dense monocular depth are leveraged to provide precise geometric supervision via the proposed Depth-alignment Regularization (DAR). Additionally, we introduce a novel Sparse-friendly Sampling (SFS) strategy to handle oversized Gaussian ellipsoids that lead to large pixel errors. Comprehensive experiments on four datasets demonstrate that LoopSparseGS outperforms existing state-of-the-art methods for sparse-input novel view synthesis across indoor, outdoor, and object-level scenes at various image resolutions.

Paper Structure (21 sections, 9 equations, 10 figures, 10 tables)

Figures (10)

  • Figure 1: In scenarios with limited input data, the standard 3D Gaussian Splatting (3DGS) method generates insufficient points and minimal depth constraints for training. Our LoopSparseGS employs additional pseudo-cameras to produce more comprehensive initialization points and richer depth information for 3DGS training. Additionally, we found that excessively large ellipsoids damage rendering quality by blurring the rendered views. To mitigate this issue, we propose a Sparse-friendly Sampling (SFS) strategy to split oversized ellipsoids. The results are presented in the third column, demonstrating the effectiveness of our method.
  • Figure 2: Overview of the proposed LoopSparseGS, which features three key components: Progressive Gaussian Initialization, Depth-alignment Regularization, and Sparse-friendly Sampling. Progressive Gaussian Initialization leverages the training views and high-quality pseudo views near them to increase the number of initial Gaussian points. Depth-alignment Regularization incorporates the precise SfM depth and the monocular depth, and aligns the two scale-invariant depth regularizers in a sliding-window manner. Sparse-friendly Sampling splits large Gaussian ellipsoids with large pixel errors to enhance the representation capacity in large pixel areas.
  • Figure 3: Illustration of the rendered RGB and depth maps without the filter strategy ("w/o" Filter) and with the filter strategy ("w" Filter). (a) Rendered image of Horns. (b) Rendered depth of Horns. (c) Rendered image of Leaves. (d) Rendered depth of Leaves. Without filtering, the rendered depth shows significant holes at the edges of the horn and leaves, while the holes are filled when using our custom filter strategy.
  • Figure 4: Illustration of the rendered depth maps using different depth supervision. (a) GT image. (b) Using SfM-derived depth supervision. (c) Using SfM-derived depth and monocular depth supervision without depth alignment. (d) Using SfM-derived depth and monocular depth supervision with our depth-alignment strategy.
  • Figure 5: Illustration of the sliding-window sampling strategy in DAR. (a) Rendered depth map. (b) Monocular depth map provided by MiDaS. Our method slides a window of a specified size over both maps to extract the rendered depth and the monocular depth. Instead of computing the Pearson loss over the whole image, we compute it within each sliding-window region, which amplifies the Pearson loss in regions where the SfM-derived depth and the monocular depth are misaligned, as illustrated by the blue box in (a) and the green box in (b) in the middle area.
  • ...and 5 more figures
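
The sliding-window idea described in the Figure 5 caption can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function names, the `1 - correlation` form of the loss, the averaging over windows, and the `window` and `stride` parameters are all assumptions made for the example. Depth maps are plain nested lists to keep the sketch dependency-free; a real training loop would use tensors.

```python
import math


def pearson(xs, ys, eps=1e-8):
    """Pearson correlation of two equal-length sequences (eps avoids 0/0)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (math.sqrt(var_x * var_y) + eps)


def crop(depth, top, left, size):
    """Flatten the size x size window whose top-left corner is (top, left)."""
    return [v for row in depth[top:top + size] for v in row[left:left + size]]


def windowed_pearson_loss(rendered, mono, window, stride):
    """Mean of (1 - Pearson) over sliding windows of two depth maps.

    Hypothetical helper: `rendered` and `mono` are H x W nested lists.
    A window covering the whole image reduces to the global Pearson loss.
    """
    height, width = len(rendered), len(rendered[0])
    if window > height or window > width:
        raise ValueError("window must fit inside the depth map")
    losses = []
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            r = crop(rendered, top, left, window)
            m = crop(mono, top, left, window)
            losses.append(1.0 - pearson(r, m))
    return sum(losses) / len(losses)
```

Because the Pearson correlation is invariant to positive scaling and shifting, a loss of this form compares the shape of the two depth maps rather than their absolute values. When a small region of the rendered depth disagrees with the monocular depth, the global correlation stays high and the disagreement is diluted, whereas the window covering that region reports a large loss on its own.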