Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
Tony Salloom, Dandi Zhou, Xinhai Sun
TL;DR
The paper addresses the mismatch between the sparsity patterns of real depth sensors and the sparse depth that is simulated for training depth completion models. It introduces a geometry-aware sparse depth sampling method that uses PCA-based surface normals to compute a per-pixel depth reliability map, then samples sparse depth according to this reliability, replacing uniform random sampling in a diffusion-based depth completion framework (Marigold-DC). Evaluated on NYU Depth v2, the approach improves RMSE and MAE and preserves edges more clearly, especially at mid-range sparse densities, which suggests that sensor-inspired sparse inputs are a more faithful basis for robotic perception. The work aims to make 3D perception in industrial robotics more robust and points to future integration of semantic priors and uncertainty estimates to improve real-world applicability.
Abstract
Accurate three-dimensional perception is essential for modern industrial robotic systems that perform manipulation, inspection, and navigation tasks. RGB-D and stereo vision sensors are widely used for this purpose, but the depth maps they produce are often noisy, incomplete, or biased due to sensor limitations and environmental conditions. Depth completion methods aim to generate dense, reliable depth maps from RGB images and sparse depth input. However, a key limitation of current depth completion pipelines is the unrealistic way sparse depth is generated: sparse pixels are typically selected uniformly at random from dense ground-truth depth, even though real sensors exhibit geometry-dependent and spatially nonuniform reliability. In this work, we propose a normal-guided sparse depth sampling strategy that applies PCA-based surface normal estimation to the RGB-D point cloud to compute a per-pixel depth reliability measure. The sparse depth samples are then drawn according to this reliability distribution. We integrate this sampling method with the Marigold-DC diffusion-based depth completion model and evaluate it on NYU Depth v2 using standard metrics. Experiments show that our geometry-aware sparse depth improves accuracy, reduces artifacts near edges and discontinuities, and produces more realistic training conditions that better reflect real sensor behavior.
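To make the sampling idea concrete, the following is a minimal NumPy sketch of normal-guided sparse sampling, not the authors' implementation. It back-projects a depth map with a pinhole model, estimates per-pixel normals by PCA over a local window, and draws sparse pixels with probability proportional to a reliability score. The abstract does not give the reliability formula, so the score used here (the absolute cosine between the surface normal and the viewing ray) is an assumption, as are the function names, the window size `k`, and the intrinsics `fx`, `fy`, `cx`, `cy`.

```python
import numpy as np


def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (H, W) to camera-frame points (H, W, 3), pinhole model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)


def pca_normals(points, k=2):
    """Per-pixel normal: eigenvector of the smallest eigenvalue of the
    covariance of the 3D points in a (2k+1) x (2k+1) window."""
    h, w, _ = points.shape
    pad = np.pad(points, ((k, k), (k, k), (0, 0)), mode="edge")
    nbrs = np.stack(
        [pad[dy:dy + h, dx:dx + w]
         for dy in range(2 * k + 1) for dx in range(2 * k + 1)],
        axis=2,
    )  # (H, W, n, 3)
    centered = nbrs - nbrs.mean(axis=2, keepdims=True)
    cov = np.einsum("hwni,hwnj->hwij", centered, centered) / nbrs.shape[2]
    # eigh returns eigenvalues in ascending order, eigenvectors as columns.
    _, eigvecs = np.linalg.eigh(cov)
    return eigvecs[..., :, 0]


def reliability_map(depth, fx, fy, cx, cy, k=2):
    """ASSUMED reliability: |cos| of the angle between normal and viewing ray.
    Surfaces facing the camera score near 1, grazing surfaces near 0."""
    points = backproject(depth, fx, fy, cx, cy)
    normals = pca_normals(points, k)
    norms = np.linalg.norm(points, axis=-1, keepdims=True)
    rays = points / np.maximum(norms, 1e-12)
    rel = np.clip(np.abs(np.sum(normals * rays, axis=-1)), 0.0, 1.0)
    rel[depth <= 0] = 0.0  # never sample pixels without valid depth
    return rel


def sample_sparse_depth(depth, reliability, n_samples, rng):
    """Draw n_samples pixels without replacement, weighted by reliability.
    Returns the sparse depth map (0 where unsampled) and the boolean mask."""
    probs = reliability.ravel() / reliability.sum()
    idx = rng.choice(depth.size, size=n_samples, replace=False, p=probs)
    mask = np.zeros(depth.size, dtype=bool)
    mask[idx] = True
    mask = mask.reshape(depth.shape)
    return np.where(mask, depth, 0.0), mask
```

A call such as `sample_sparse_depth(depth, reliability_map(depth, fx, fy, cx, cy), 500, np.random.default_rng(0))` would replace a uniform random draw of 500 pixels. This sketch does not exclude invalid neighbors from the PCA window, which a practical version would likely need to handle near holes in the depth map.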
