Stereo-LiDAR Depth Estimation with Deformable Propagation and Learned Disparity-Depth Conversion

Ang Li; Anning Hu; Wei Xi; Wenxian Yu; Danping Zou

TL;DR

This work addresses dense depth estimation from stereo images and sparse LiDAR. A deformable propagation step spreads the sparse LiDAR hints into a semi-dense hint map with an accompanying confidence map, and a learned disparity-depth conversion mitigates triangulation error, which grows quadratically with depth $D$, i.e. as $O(D^2)$. The SDG-Depth architecture combines a Deformable Propagation (DP) module, Confidence-based Gaussian (CG) modulation of the cost volume, a coarse-to-fine 3D CNN, and a Disparity-Depth Conversion (DDC) module to produce accurate dense depth maps efficiently. The method achieves state-of-the-art performance on KITTI depth completion and competitive results on the synthetic Virtual KITTI2 and real MS2 datasets, with notable improvements for distant objects and boundary regions. By integrating global-aware hint propagation with edge-aware depth refinement, the approach offers a practical solution for stereo-LiDAR perception in autonomous driving.
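The quadratic growth of triangulation error follows from the standard pinhole stereo relation $Z = fB/d$: to first order, a disparity error $\delta d$ produces a depth error of about $Z^2 \delta d / (fB)$. The short sketch below illustrates this. The focal length and baseline are illustrative values chosen for the example, not parameters taken from the paper, and the function names are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard stereo triangulation: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, focal_px, baseline_m, disparity_err_px):
    """First-order depth error caused by a disparity error.

    Differentiating Z = f * B / d gives |dZ| ~= Z^2 / (f * B) * |dd|,
    so the error grows with the square of the depth.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


# Assumed camera parameters, for illustration only.
FOCAL_PX = 720.0
BASELINE_M = 0.54

if __name__ == "__main__":
    for z in (10.0, 20.0, 40.0, 80.0):
        err = depth_error(z, FOCAL_PX, BASELINE_M, disparity_err_px=0.5)
        print(f"depth {z:5.1f} m -> approx. depth error {err:6.2f} m")
```

Doubling the depth quadruples the error for the same disparity error, which is why a correction step after triangulation matters most for distant regions.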

Abstract

Accurate and dense depth estimation with stereo cameras and LiDAR is an important task for autonomous driving and robotic perception. While sparse hints from LiDAR points have improved cost aggregation in stereo matching, their effectiveness is limited by their low density and non-uniform distribution. To address this issue, we propose a novel stereo-LiDAR depth estimation network with Semi-Dense hint Guidance, named SDG-Depth. Our network includes a deformable propagation module that generates a semi-dense hint map and a confidence map by propagating sparse hints within a learned deformable window. These maps then guide cost aggregation in stereo matching. To reduce the triangulation error in depth recovery from disparity, especially in distant regions, we introduce a disparity-depth conversion module. Our method is both accurate and efficient, and experimental results on benchmarks demonstrate its superior performance. Our code is available at https://github.com/SJTU-ViSYS/SDG-Depth.
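To make the idea of hint-guided cost aggregation concrete, the following is a minimal sketch of confidence-weighted Gaussian modulation of a cost volume. It is not the paper's formulation, which this summary does not give. It assumes a common form in which each pixel's matching scores are multiplied by a Gaussian centred on the hint disparity and blended with the unmodulated scores according to confidence. The function name and the parameters `k` (peak height) and `c` (width) are hypothetical.

```python
import numpy as np


def gaussian_modulate(cost_volume, hint_disp, confidence, k=10.0, c=1.0):
    """Modulate a cost volume with semi-dense disparity hints (sketch).

    cost_volume: (D, H, W) matching scores, higher means a better match.
    hint_disp:   (H, W) hint disparities, 0 where no hint is available.
    confidence:  (H, W) confidence of each hint, in [0, 1].
    k, c:        assumed Gaussian peak height and width.
    """
    num_disp = cost_volume.shape[0]
    # Disparity candidates, shaped (D, 1, 1) for broadcasting over pixels.
    d = np.arange(num_disp, dtype=np.float32)[:, None, None]
    gauss = k * np.exp(-((d - hint_disp[None]) ** 2) / (2.0 * c ** 2))
    # Pixels without a hint get blend weight 0 and are left untouched.
    w = (hint_disp > 0).astype(np.float32) * confidence
    weight = (1.0 - w)[None] + w[None] * gauss
    return cost_volume * weight


if __name__ == "__main__":
    volume = np.ones((4, 1, 2), dtype=np.float32)
    hints = np.array([[2.0, 0.0]], dtype=np.float32)  # second pixel: no hint
    conf = np.array([[1.0, 0.7]], dtype=np.float32)
    out = gaussian_modulate(volume, hints, conf)
    print(out[:, 0, 0])  # scores sharpened around the hint disparity
    print(out[:, 0, 1])  # scores unchanged where no hint exists
```

In this sketch a low-confidence hint only mildly reshapes the scores, so an unreliable propagated value cannot override the image evidence.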

Paper Structure (18 sections, 9 equations, 6 figures, 5 tables)

INTRODUCTION
RELATED WORKS
Stereo Matching
Monocular Depth Completion
Stereo-LiDAR Fusion
METHOD
Deformable Propagation (DP) Module
Confidence-based Gaussian Module
Coarse-to-fine 3D CNN
Disparity-Depth Conversion (DDC) Module
Loss Function
EXPERIMENTS
Datasets
Implementation Details
Benchmark Evaluation
...and 3 more sections

Figures (6)

Figure 1: Our network achieves the best trade-off between accuracy and inference speed on the KITTI Completion dataset. Green, blue, and red marks represent results from stereo matching, monocular depth completion, and stereo-LiDAR fusion methods, respectively.
Figure 2: The architecture of our proposed network. First, the disparity Deformable Propagation (DP) module propagates sparse LiDAR hints within variably shaped windows to produce a semi-dense disparity map. Based on the generated disparity map and confidence map, the Confidence-based Gaussian (CG) module regulates the cost volume that is constructed from the features of the stereo images and the expanded disparity. Subsequently, dense disparity is obtained by applying a coarse-to-fine 3D CNN to the regulated cost volume. Finally, the learned Disparity-Depth Conversion (DDC) module recovers depth from the disparity predicted by the 3D CNN.
Figure 3: Disparity deformable propagation module. The module computes the propagation weight by employing local self-correlation based on the learned 2D offset field and propagates sparse hints within the deformable windows.
Figure 4: Disparity-depth conversion module. The module generates pixel-wise residuals $\delta_1$ and $\delta_2$ in both the disparity and depth space, based on high-frequency features.
Figure 5: Qualitative results on the KITTI depth completion dataset (Uhrig et al., 2017). Our network produces more accurate predictions with smaller depth errors (in blue) and more regular object shapes in distant regions, compared to other state-of-the-art stereo and stereo-LiDAR methods.
...and 1 more figure
