RDG-GS: Relative Depth Guidance with Gaussian Splatting for Real-time Sparse-View 3D Rendering

Chenlu Zhan, Yufei Zhang, Yu Lin, Gaoang Wang, Hongwei Wang

TL;DR

RDG-GS tackles sparse-view 3D rendering by introducing Relative Depth Guidance to refine a Gaussian Splatting representation. It combines refined depth priors that integrate global and local image cues with a relative depth guidance loss that aligns depth-image relationships across patches, plus an adaptive sampling strategy to densify initialization in regions with high training error. The method yields state-of-the-art rendering quality and real-time performance across Mip-NeRF360, LLFF, DTU, and Blender datasets, substantially improving geometry accuracy and texture fidelity under sparse views. This approach mitigates dependence on single-view monocular depth, enhances view-consistent geometry, and offers practical benefits for real-world sparse-view applications.

Abstract

Efficiently synthesizing novel views from sparse inputs while maintaining accuracy remains a critical challenge in 3D reconstruction. While advanced techniques like radiance fields and 3D Gaussian Splatting achieve high rendering quality and impressive efficiency with dense view inputs, they suffer from significant geometric reconstruction errors when applied to sparse input views. Moreover, although recent methods leverage monocular depth estimation to enhance geometric learning, their dependence on single-view estimated depth often leads to inconsistency across viewpoints. Consequently, this reliance on absolute depth can introduce inaccuracies in geometric information, ultimately compromising the quality of scene reconstruction with Gaussian splats. In this paper, we present RDG-GS, a novel sparse-view 3D rendering framework with Relative Depth Guidance based on 3D Gaussian Splatting. The core innovation lies in utilizing relative depth guidance to refine the Gaussian field, steering it towards view-consistent spatial geometric representations, thereby enabling the reconstruction of accurate geometric structures and the capture of intricate textures. First, we devise refined depth priors to rectify the coarse estimated depth and inject global and fine-grained scene information into regular Gaussians. Building on this, to address spatial geometric inaccuracies from absolute depth, we propose relative depth guidance, which optimizes the similarity between spatially correlated patches of depth and images. Additionally, we directly address sparse areas that are hard to converge by using adaptive sampling for quick densification. Across extensive experiments on Mip-NeRF360, LLFF, DTU, and Blender, RDG-GS demonstrates state-of-the-art rendering quality and efficiency, making a significant advancement for real-world applications.
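
The abstract describes relative depth guidance only in words: similarity is optimized between spatially correlated patches rather than between absolute depth values. The sketch below is one possible reading of that idea, not the paper's actual loss. The function names, the non-overlapping patch layout, the zero-mean unit-norm normalization, and the L1 comparison of similarity matrices are all assumptions made for illustration; `reference` stands for whatever guidance map is used (for example a refined depth prior).

```python
import numpy as np

def extract_patches(img, size):
    """Split a 2D map (H, W) into non-overlapping size x size patches, one per row."""
    h, w = img.shape
    h, w = h - h % size, w - w % size  # crop so the map tiles evenly
    blocks = img[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, size * size)

def normalize(patches, eps=1e-8):
    """Zero-mean, unit-norm each patch so only its relative structure remains."""
    centered = patches - patches.mean(axis=1, keepdims=True)
    norm = np.linalg.norm(centered, axis=1, keepdims=True)
    return centered / (norm + eps)

def relative_similarity_loss(rendered_depth, reference, size=4):
    """Mean absolute gap between the patch-to-patch similarity matrices (assumed form)."""
    a = normalize(extract_patches(rendered_depth, size))
    b = normalize(extract_patches(reference, size))
    sim_a = a @ a.T  # cosine similarity between every pair of rendered-depth patches
    sim_b = b @ b.T  # the same for the reference map
    return float(np.abs(sim_a - sim_b).mean())
```

Because each patch is normalized before comparison, a rendered depth map that differs from the reference only by a global scale and offset incurs almost no penalty under this sketch, which is the sense in which it is "relative" rather than "absolute".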

Paper Structure (36 sections, 15 equations, 12 figures, 13 tables)

Figures (12)

  • Figure 1: (a) General Absolute Depth Method. Most methods li2024dngaussian, zhu2023fsgs, guo2024depth rely on monocular estimated depth, combining depth regularization and image reconstruction losses to optimize the Gaussian field. However, this approach relies on single-view depth, which introduces inconsistency and erroneous geometric information, resulting in inaccurate geometric scene structures (highlighted in blue boxes). (b) Our Proposed Relative Depth Guidance: By using refined depth with view-consistent spatial geometric information, we compute patch-wise similarity to extract relative geometric cues that resolve the inconsistency, enabling accurate scene geometry reconstruction and high-quality rendering (highlighted in blue boxes).
  • Figure 2: The network structure of RDG-GS. (A) We obtain the refined depth by optimizing the energy module to insert global and fine-grained scene information into the optimization of Gaussian Splatting. (B) We propose relative depth guidance, which optimizes the similarity between spatially correlated patches of depth and images to overcome the view-inconsistent spatial information caused by absolute depth and to guide scene geometry. (C) We employ adaptive densification by sampling areas with large training errors for more accurate and faster rendering.
  • Figure 3: Visual comparisons of RDG-GS (ours) and CoR-GS zhang2025cor with 12 and 24 training views on Mip-NeRF360 barron2022mip260.
  • Figure 4: Comparison of RDG-GS with the SOTA works SparseNeRF wang2023sparsenerf and 3D Gaussian Splatting kerbl20233dggs on sparse-view 3D reconstruction with $24$ training views. The proposed RDG-GS produces refined depth priors with correct geometric shapes and fine-grained details, and reconstructs high-quality scenes in real time.
  • Figure 5: More qualitative results of rendered depth on the Mip-NeRF360 dataset barron2022mip260, comparing RDG-GS, 3D-GS kerbl20233dggs, CoR-GS zhang2025cor, and FSGS zhu2023fsgs in generating accurate scene geometry and high-frequency texture details.
  • ...and 7 more figures
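
Figure 2 (C) mentions sampling areas with large training errors for densification but gives no procedure here. A minimal sketch of one way such sampling could work is shown below, assuming a per-pixel error map is available; the function name `sample_high_error_pixels`, the proportional-to-error sampling rule, and the uniform fallback are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def sample_high_error_pixels(error_map, num_samples, seed=0):
    """Draw (row, col) pixel coordinates with probability proportional to error."""
    rng = np.random.default_rng(seed)
    weights = np.clip(error_map, 0.0, None).ravel()  # negative errors count as zero
    total = weights.sum()
    if total == 0:
        # No error anywhere: fall back to uniform sampling (assumed behavior)
        probs = np.full(weights.size, 1.0 / weights.size)
    else:
        probs = weights / total
    flat_idx = rng.choice(weights.size, size=num_samples, replace=True, p=probs)
    rows, cols = np.unravel_index(flat_idx, error_map.shape)
    return np.stack([rows, cols], axis=1)
```

In a pipeline of this kind, the sampled pixel locations would serve as candidates for placing new Gaussians, for example by back-projecting them with a depth estimate; that step is outside the scope of this sketch.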