Depth-Regularized Optimization for 3D Gaussian Splatting in Few-Shot Images

Jaeyoung Chung, Jeongtaek Oh, Kyoung Mu Lee

TL;DR

This work tackles the overfitting tendency of 3D Gaussian Splatting when optimized from very few images. It introduces a depth-guided optimization pipeline that leverages a dense depth prior derived from a pretrained monocular depth model, scaled to align with sparse COLMAP points, and integrated via a differentiable depth rasterizer and depth loss. A smoothness constraint and an early-stop strategy further stabilize training in the few-shot regime. On NeRF-LLFF, the method yields substantially improved geometry and rendering quality compared to the original 3DGS, showing practical potential for high-quality 3D reconstructions from minimal imagery.

Abstract

In this paper, we present a method to optimize Gaussian splatting with a limited number of images while avoiding overfitting. Representing a 3D scene by combining numerous Gaussian splats has yielded outstanding visual quality. However, it tends to overfit the training views when only a small number of images are available. To address this issue, we introduce a dense depth map as a geometry guide to mitigate overfitting. We obtain the depth map using a pre-trained monocular depth estimation model and align its scale and offset using sparse COLMAP feature points. The adjusted depth aids the color-based optimization of 3D Gaussian splatting, mitigating floating artifacts and ensuring adherence to geometric constraints. We verify the proposed method on the NeRF-LLFF dataset with varying small numbers of input images. Our approach demonstrates more robust geometry than the original method, which relies solely on images. Project page: robot0321.github.io/DepthRegGS
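The scale-and-offset alignment mentioned in the abstract can be illustrated with an ordinary least-squares fit. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`fit_scale_offset`, `align_depth_map`), the pixel-index layout, and the toy data are hypothetical, and the abstract does not say whether the fit is done in depth or inverse-depth space or whether the points are weighted.

```python
import numpy as np


def fit_scale_offset(mono_depth, sparse_depth):
    """Least-squares fit of (s, t) so that s * mono_depth + t ~= sparse_depth.

    mono_depth:   monocular depth values sampled at the sparse feature pixels
    sparse_depth: depths of the same pixels from the sparse reconstruction
    (Hypothetical helper; an unweighted fit is assumed here.)
    """
    d = np.asarray(mono_depth, dtype=np.float64).ravel()
    z = np.asarray(sparse_depth, dtype=np.float64).ravel()
    # Design matrix [d, 1] so the solution vector is (scale, offset).
    A = np.stack([d, np.ones_like(d)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, z, rcond=None)
    return float(s), float(t)


def align_depth_map(mono_map, pixels, sparse_depth):
    """Apply the fitted scale and offset to the whole dense depth map.

    pixels: integer array of shape (N, 2) holding (row, col) indices of the
            sparse points inside mono_map (assumed layout).
    """
    rows, cols = pixels[:, 0], pixels[:, 1]
    s, t = fit_scale_offset(mono_map[rows, cols], sparse_depth)
    return s * mono_map + t, s, t


# Toy data: a random "monocular" depth map and five sparse points whose
# depths follow a known scale and offset, so the fit can be checked.
rng = np.random.default_rng(0)
mono_map = rng.uniform(0.1, 1.0, size=(8, 8))
pixels = np.array([[0, 0], [1, 5], [3, 2], [6, 7], [7, 1]])
true_s, true_t = 2.5, 0.3
sparse_depth = true_s * mono_map[pixels[:, 0], pixels[:, 1]] + true_t

aligned, s, t = align_depth_map(mono_map, pixels, sparse_depth)
```

With noise-free toy data the fit recovers the scale and offset used to generate the sparse depths. With real COLMAP points the fit would only be approximate, and a robust or weighted variant may be preferable.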

Paper Structure

This paper contains 17 sections, 8 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: The efficacy of depth regularization in a few-shot setting. We optimize Gaussian splats with a limited number of images, avoiding overfitting through geometry guidance estimated from the images. Note that only two images were used to create this 3D scene.
  • Figure 2: Overview. We optimize 3D Gaussian splatting [kerbl20233d] using dense depth maps adjusted with the point clouds obtained from COLMAP [schonberger2016structure]. By incorporating depth maps to regulate the geometry of the 3D scene, our model successfully reconstructs scenes using a limited number of images.
  • Figure 3: Qualitative comparison on the NeRF-LLFF [mildenhall2019llff] dataset. We visualize the difference between 3D Gaussian Splatting (3DGS) [kerbl20233d] and our method in both 2-view and 5-view settings. Driven primarily by color loss, 3DGS struggles to achieve desirable geometry. With depth guidance, our approach consistently establishes plausible geometric structures, resulting in superior reconstruction outcomes.
  • Figure 4: Details in cropped patches. (a) Input view, (b) 3DGS [kerbl20233d], (c) ours, (d) ground truth. Leveraging additional geometric cues, our method establishes stable geometry and produces superior reconstruction results compared to 3DGS [kerbl20233d].
  • Figure 5: Example results utilizing pseudo-GT depth (oracle). Accurate depth facilitates high-quality 3D reconstruction, even with a limited number of images. Fine details are perceptible in both RGB and depth.