Stereo-LiDAR Fusion by Semi-Global Matching With Discrete Disparity-Matching Cost and Semidensification

Yasuhiro Yao, Ryoichi Ishikawa, Takeshi Oishi

TL;DR

The paper tackles real-time, accurate depth estimation by fusing stereo cameras with LiDAR data without learning-based training. It introduces a non-learning pipeline built on Semi-Global Matching (SGM) augmented with Discrete Disparity-matching Cost (DDC), semidensification of LiDAR priors, and a three-view stereo-LiDAR consistency check, implemented on GPU. On KITTI 141, it achieves a total error of $2.79\%$, surpassing prior real-time non-learning methods, and shows robustness to LiDAR density, weather, and indoor scenes. The approach offers practical benefits for robotics due to real-time performance and domain adaptability without training.

Abstract

We present a real-time, non-learning depth estimation method that fuses Light Detection and Ranging (LiDAR) data with stereo camera input. Our approach comprises three key techniques: Semi-Global Matching (SGM) stereo with Discrete Disparity-matching Cost (DDC), semidensification of LiDAR disparity, and a consistency check that combines stereo images and LiDAR data. Each of these components is designed for parallelization on a GPU to realize real-time performance. When it was evaluated on the KITTI dataset, the proposed method achieved an error rate of 2.79%, outperforming the previous state-of-the-art real-time stereo-LiDAR fusion method, which had an error rate of 3.05%. Furthermore, we tested the proposed method in various scenarios, including different LiDAR point densities, varying weather conditions, and indoor environments, to demonstrate its high adaptability. We believe that the real-time and non-learning nature of our method makes it highly practical for applications in robotics and automation.
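
To make the reported error rates concrete, the following is a minimal sketch of how a disparity error rate of this kind can be computed. It assumes the common KITTI-2015-style outlier rule (a pixel counts as an error when its disparity is off by more than 3 px and by more than 5% of the ground truth); this section does not state the exact criterion used for the 2.79% figure, so the thresholds and the function name `disparity_error_rate` are illustrative assumptions, not the authors' evaluation code.

```python
def disparity_error_rate(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Return the percentage of valid ground-truth pixels that are outliers.

    pred, gt: equally sized 2D lists of disparities in pixels.
    A ground-truth value <= 0 marks a pixel without ground truth (skipped).
    Outlier rule (assumed, KITTI-2015 style): absolute error > abs_thresh
    AND absolute error > rel_thresh * ground-truth disparity.
    """
    valid = 0
    outliers = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if g <= 0:
                continue  # no ground truth at this pixel
            valid += 1
            err = abs(p - g)
            if err > abs_thresh and err > rel_thresh * g:
                outliers += 1
    if valid == 0:
        return 0.0  # nothing to evaluate
    return 100.0 * outliers / valid


# Toy 2x2 example: one pixel has no ground truth, one of the other three is an outlier.
gt = [[10.0, 0.0], [50.0, 100.0]]
pred = [[14.0, 7.0], [52.0, 104.0]]
rate = disparity_error_rate(pred, gt)
print(f"error rate: {rate:.2f}%")
```

In the toy example, the pixel with ground truth 100 is off by 4 px but is not counted as an outlier, because 4 px is below 5% of its true disparity; only the pixel with ground truth 10 fails both thresholds.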

Paper Structure

This paper contains 19 sections, 6 equations, 16 figures, 7 tables.

Figures (16)

  • Figure 1: Flow chart of the proposed method. The semidensification process takes stereo images and the sparse disparity map ($\bar{D}$) and outputs the semidense disparity map ($\hat{D}$). SGM with DDC takes stereo images with either a semidense disparity map ($\hat{D}$) or a sparse disparity map ($\bar{D}$) and outputs a dense disparity map. The stereo-LiDAR consistency check annotates invalid disparities based on the consistency of the three views to obtain a consistent dense disparity map ($D^{c*}$).
  • Figure 2: Dataset and results of KITTI 141 evaluation. Overall, offline learning-based LSNet [cheng2019noise] showed the least error, as seen on the car in the error maps. SSM-TGV [yao2021non] showed a more significant error than our methods, as seen on the car roof in the error map.
  • Figure 3: Images
  • Figure 4: LSNet [cheng2019noise]
  • Figure 5: SSM-TGV [yao2021non]
  • ...and 11 more figures