DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization

Jiahe Li, Jiawei Zhang, Xiao Bai, Jin Zheng, Xin Ning, Jun Zhou, Lin Gu

TL;DR

DNGaussian is a depth-regularized framework based on 3D Gaussian radiance fields that offers real-time, high-quality few-shot novel view synthesis at low cost. It introduces a Hard and Soft Depth Regularization that restores accurate scene geometry under coarse monocular depth supervision while maintaining a fine-grained color appearance.

Abstract

Radiance fields have demonstrated impressive performance in synthesizing novel views from sparse input views, yet prevailing methods suffer from high training costs and slow inference speed. This paper introduces DNGaussian, a depth-regularized framework based on 3D Gaussian radiance fields, offering real-time and high-quality few-shot novel view synthesis at low costs. Our motivation stems from the highly efficient representation and surprising quality of the recent 3D Gaussian Splatting, even though it suffers from geometry degradation when the number of input views decreases. In Gaussian radiance fields, we find that this degradation in scene geometry is primarily linked to the positioning of Gaussian primitives and can be mitigated by depth constraints. Consequently, we propose a Hard and Soft Depth Regularization to restore accurate scene geometry under coarse monocular depth supervision while maintaining a fine-grained color appearance. To further refine detailed geometry reshaping, we introduce Global-Local Depth Normalization, enhancing the focus on small local depth changes. Extensive experiments on LLFF, DTU, and Blender datasets demonstrate that DNGaussian outperforms state-of-the-art methods, achieving comparable or better results with significantly reduced memory cost, a $25 \times$ reduction in training time, and over $3000 \times$ faster rendering speed.

Paper Structure

This paper contains 22 sections, 12 equations, 15 figures, and 11 tables.

Figures (15)

  • Figure 1: Comparison of the state-of-the-art methods FreeNeRF [yang2023freenerf] and SparseNeRF [wang2023sparsenerf] with our DNGaussian, using three views for training. DNGaussian stands out by delivering comparably high-quality synthesized views and superior details with a remarkable 25× reduction in time and significantly lower memory overhead during training, while attaining the fastest, and the only real-time, rendering speed of 300 FPS. The point cloud of Gaussians illustrates the detailed and explainable spatial representation learned through our method.
  • Figure 2: 3D Gaussian Splatting [kerbl20233d] shows the potential to reconstruct some fine details (green box) from sparse input views. Nevertheless, reducing the number of input views significantly degrades geometry and causes failed reconstruction (orange box). After applying depth regularization, DNGaussian successfully recovers accurate geometry and synthesizes high-quality novel views.
  • Figure 3: The framework of DNGaussian. Our framework starts from a random initialization and consists of a Color Supervision module and a Depth Regularization module. The optimization process of Color Supervision mainly inherits from 3D Gaussian Splatting [kerbl20233d], except for a Neural Color Renderer. In the Depth Regularization module, we render a Hard Depth and a Soft Depth for the input view and separately calculate their losses against the pre-generated monocular depth map with the proposed Global-Local Depth Normalization. Finally, the output Gaussian field enables efficient and high-quality novel view synthesis.
  • Figure 4: A fixed global scale pays little attention to small depth errors even under L1 loss, which leads to noisy primitives and causes failures in novel views (yellow box). Our Global-Local Depth Normalization refocuses on small errors via a local scale and helps reconstruct a more accurate appearance (green box).
  • Figure 5: Qualitative comparison on LLFF. 3DGS [kerbl20233d] fails to synthesize accurate novel views under sparse inputs. The rendered views from FreeNeRF [yang2023freenerf] and SparseNeRF [wang2023sparsenerf] are both smooth but lose too many details. FreeNeRF further learns wrong geometry in complete scenes. Our method learns more complete foreground geometry and renders high-quality novel views with fine details.
  • ...and 10 more figures
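
The contrast described in the Figure 4 caption, between a fixed global scale and a local scale, can be illustrated with a small sketch. This is not the authors' implementation: the patch layout, the use of standard deviation as the scale, the squared-error loss, and all names (`patch_normalize`, `depth_loss`, `reference`, `rendered`) are assumptions chosen for illustration. The toy depth map has a near patch with a tiny ripple that the rendered depth gets inverted, and a far patch with large depth variation that matches exactly.

```python
from statistics import fmean, pstdev


def patch_normalize(depth, patch, use_local_scale, eps=1e-6):
    """Center each patch on its own mean, then divide by a scale.

    Local scale: the patch's own standard deviation (assumed choice).
    Global scale: the standard deviation of the whole depth map.
    """
    h, w = len(depth), len(depth[0])
    global_scale = pstdev([v for row in depth for v in row])
    out = [[0.0] * w for _ in range(h)]
    for r0 in range(0, h, patch):
        for c0 in range(0, w, patch):
            cells = [(r, c)
                     for r in range(r0, min(r0 + patch, h))
                     for c in range(c0, min(c0 + patch, w))]
            values = [depth[r][c] for r, c in cells]
            mean = fmean(values)
            scale = pstdev(values) if use_local_scale else global_scale
            for r, c in cells:
                out[r][c] = (depth[r][c] - mean) / (scale + eps)
    return out


def depth_loss(rendered, reference, patch, use_local_scale):
    """Mean squared error between the two normalized depth maps."""
    a = patch_normalize(rendered, patch, use_local_scale)
    b = patch_normalize(reference, patch, use_local_scale)
    return fmean((x - y) ** 2
                 for row_a, row_b in zip(a, b)
                 for x, y in zip(row_a, row_b))


# Toy 2x4 depth maps: left 2x2 patch is near (small ripple),
# right 2x2 patch is far (large variation).
reference = [[1.0, 1.1, 5.0, 9.0],
             [1.1, 1.0, 5.0, 9.0]]
# The rendered depth inverts the small ripple but matches the far patch.
rendered = [[1.1, 1.0, 5.0, 9.0],
            [1.0, 1.1, 5.0, 9.0]]

global_loss = depth_loss(rendered, reference, patch=2, use_local_scale=False)
local_loss = depth_loss(rendered, reference, patch=2, use_local_scale=True)
print(f"global-scale loss: {global_loss:.6f}")
print(f"local-scale loss:  {local_loss:.6f}")
```

Under these assumptions, the global scale is dominated by the far patch, so the inverted ripple contributes almost nothing to the loss, whereas the local scale makes the same error the dominant term. How the two normalizations are actually combined and weighted in DNGaussian is specified in the paper itself, not in this sketch.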