Mode-GS: Monocular Depth Guided Anchored 3D Gaussian Splatting for Robust Ground-View Scene Rendering
Yonghan Lee, Jaehoon Choi, Dongki Jung, Jaeseong Yun, Soohyun Ryu, Dinesh Manocha, Suyong Yeon
TL;DR
Mode-GS targets robust novel-view rendering for ground-robot datasets with sparse multi-view observations and imperfect poses. It combines pixel-aligned anchors derived from monocular depth with anchored Gaussian splats and a residual-form Gaussian decoder, and uses a scale-consistent depth loss to handle the scale ambiguity of monocular depth. The method achieves state-of-the-art rendering performance on the R3LIVE odometry dataset and competitive results on Tanks and Temples, without relying on LiDAR or dense SfM point clouds. Ablations indicate that depth calibration and the residual decoder improve training speed and robustness. The approach offers a practical, point-cloud-free pipeline for ground-view rendering with free trajectories, broadening its applicability to real-world robotic perception.
Abstract
We present Mode-GS, a novel-view rendering algorithm for ground-robot trajectory datasets. Our approach uses anchored Gaussian splats, which are designed to overcome the limitations of existing 3D Gaussian splatting algorithms. Prior neural rendering methods suffer from severe splat drift due to scene complexity and insufficient multi-view observation, and can fail to fix splats on the true geometry in ground-robot datasets. Our method integrates pixel-aligned anchors from monocular depths and generates Gaussian splats around these anchors using residual-form Gaussian decoders. To address the inherent scale ambiguity of monocular depth, we parameterize anchors with per-view depth scales and employ a scale-consistent depth loss for online scale calibration. Our method improves rendering performance, as measured by PSNR, SSIM, and LPIPS, in ground scenes with free trajectory patterns, and achieves state-of-the-art rendering performance on the R3LIVE odometry dataset and the Tanks and Temples dataset.
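As a rough illustration of what pixel-aligned anchors with a per-view depth scale could look like, the NumPy sketch below lifts a monocular depth map into world-space points. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `backproject_anchors`, the pinhole camera model, the integer pixel-coordinate convention, and the `stride` subsampling are all hypothetical choices made for illustration. In training, the scalar `scale` would presumably be a learnable per-view parameter; here it is a plain number.

```python
import numpy as np

def backproject_anchors(depth, K, cam_to_world, scale=1.0, stride=1):
    """Lift a depth map to world-space points, one per sampled pixel.

    depth        : (H, W) depth map from a monocular estimator (up to scale)
    K            : (3, 3) pinhole intrinsics (assumed camera model)
    cam_to_world : (4, 4) camera-to-world pose
    scale        : per-view depth scale (hypothetical stand-in for a learned value)
    stride       : pixel subsampling step (hypothetical, for illustration)
    """
    H, W = depth.shape
    # Row (v) and column (u) indices of the sampled pixels.
    v, u = np.mgrid[0:H:stride, 0:W:stride]
    # Apply the per-view scale to the scale-ambiguous monocular depth.
    z = scale * depth[v, u].astype(np.float64)
    # Back-project through the pinhole model into the camera frame.
    x = (u - K[0, 2]) / K[0, 0] * z
    y = (v - K[1, 2]) / K[1, 1] * z
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=-1).reshape(-1, 4)
    # Transform homogeneous camera-frame points into the world frame.
    return (pts_cam @ cam_to_world.T)[:, :3]
```

Each returned point stays tied to the pixel it came from, so changing `scale` moves all anchors of that view along their viewing rays. This is the degree of freedom a per-view depth scale exposes to calibration.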
