A re-calibration method for object detection with multi-modal alignment bias in autonomous driving

Zhihang Song, Dingyi Yao, Ruibo Ming, Lihui Peng, Danya Yao, Yi Zhang

TL;DR

This work addresses the vulnerability of multi-modal autonomous driving perception to calibration bias between LiDAR and camera. It introduces a semantic-segmentation-guided re-calibration framework that ingests LiDAR, images, and the initial extrinsic matrix to predict corrected calibration parameters, enabling more robust object detection with EPNet++. The approach demonstrates significant performance gains under calibration perturbations and offers faster runtime than existing calibration methods, highlighting practical value for real-world deployment. The method is modular and compatible with various detection pipelines, and future work points toward joint spatio-temporal recalibration and uncertainty-aware enhancements.

Abstract

Multi-modal object detection in autonomous driving has achieved major breakthroughs by fusing complementary information from different sensors. Previous work has generally assumed that the calibration between sensors such as LiDAR and camera is precise. In reality, however, calibration matrices are fixed when vehicles leave the factory, and mechanical vibration, road bumps, and data lags may introduce calibration bias. Because research on the impact of calibration on fusion detection performance is relatively limited, multi-sensor detection methods that depend less rigidly on calibration remain a key objective. In this paper, we systematically evaluate the sensitivity of the SOTA EPNet++ detection framework and show that even a slight calibration bias can seriously reduce performance. To address this vulnerability, we propose a re-calibration model that corrects the misalignment in detection tasks. The model takes the LiDAR point cloud, camera image, and initial calibration matrix as inputs and generates the re-calibrated bias through semantic segmentation guidance and a tailored loss function design. The re-calibration model can operate with existing detection algorithms, enhancing both robustness against calibration bias and overall object detection performance. Our approach establishes a foundational methodology for maintaining reliability in multi-modal perception systems under real-world calibration uncertainties.
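
To give a rough sense of why a slight calibration bias matters, the sketch below projects a single point into the image with a pinhole model, once with an unbiased extrinsic rotation and once with a 1-degree rotation error, and measures the pixel shift. It is an illustration only, not the paper's evaluation protocol: the intrinsics (`FX`, `FY`, `CX`, `CY`), the helper names `rot_y`, `project`, and `pixel_shift`, and the simplification that the point is already expressed in camera-aligned axes (z forward) are all assumptions made for this example.

```python
import math

# Hypothetical pinhole intrinsics (illustrative values, not from the paper).
FX, FY, CX, CY = 700.0, 700.0, 620.0, 190.0

def rot_y(theta):
    """Rotation matrix about the camera's vertical (y) axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def project(point, R, t):
    """Apply the extrinsic (R, t), then project with the pinhole model."""
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    x, y, z = cam
    return (FX * x / z + CX, FY * y / z + CY)

def pixel_shift(point, bias_rad):
    """Pixel distance between the unbiased and the biased projection."""
    t = [0.0, 0.0, 0.0]
    u0, v0 = project(point, rot_y(0.0), t)
    u1, v1 = project(point, rot_y(bias_rad), t)
    return math.hypot(u1 - u0, v1 - v0)

# A point 20 m straight ahead, seen with a 1-degree rotation bias.
shift = pixel_shift([0.0, 0.0, 20.0], math.radians(1.0))
print(f"pixel shift for a 1-degree bias: {shift:.2f} px")
```

Under these assumed intrinsics, a 1-degree rotation error displaces the projected point by roughly a dozen pixels, which is enough to pair a LiDAR point with image features from a neighbouring object or the background.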

Paper Structure (10 sections, 3 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Framework of the re-calibration model. The model takes a LiDAR point cloud, an image, and the original calibration matrix as inputs and outputs the re-calibrated bias.
  • Figure 2: Projected loss. White points are the correct projected positions given by the label calibration. Green points are the positions projected with the output re-calibration results.
  • Figure 3: Corrupted feature fusion caused by Gaussian noise in calibration. (Left: without noise. Right: with noise.)
  • Figure 4: Corrupted feature fusion caused by point translation in LiDAR. (Left: without translation. Right: with translation.)
  • Figure 5: Comparison of point projections with Gaussian noise in calibration before and after re-calibration. (White: label; Green: before re-calibration; Blue: after re-calibration.)
  • ...and 2 more figures
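
The Figure 2 caption describes comparing points projected with the label calibration against points projected with the re-calibrated output. One plausible way to turn that comparison into a number is the mean pixel distance between the two projections, sketched below. The paper's exact loss formula is not given in this section, so the mean Euclidean distance, the 3x4 projection-matrix form, and the names `make_projection`, `project`, and `projected_loss` are assumptions for illustration.

```python
import math

def make_projection(fx, fy, cx, cy, tx):
    """Build a 3x4 matrix K [I | t] with t = (tx, 0, 0); hypothetical helper."""
    return [
        [fx, 0.0, cx, fx * tx],
        [0.0, fy, cy, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]

def project(P, point):
    """Project a 3D point with a 3x4 matrix and divide by depth."""
    x, y, z = point
    h = [P[r][0] * x + P[r][1] * y + P[r][2] * z + P[r][3] for r in range(3)]
    return (h[0] / h[2], h[1] / h[2])

def projected_loss(P_label, P_pred, points):
    """Mean pixel distance between label and predicted projections (assumed form)."""
    total = 0.0
    for p in points:
        u0, v0 = project(P_label, p)
        u1, v1 = project(P_pred, p)
        total += math.hypot(u1 - u0, v1 - v0)
    return total / len(points)

# Two points at depths of 10 m and 20 m (camera-aligned axes, z forward).
points = [[1.0, 0.5, 10.0], [-2.0, 0.0, 20.0]]
P_label = make_projection(700.0, 700.0, 620.0, 190.0, tx=0.0)
P_pred = make_projection(700.0, 700.0, 620.0, 190.0, tx=0.1)  # 10 cm residual error
loss = projected_loss(P_label, P_pred, points)
print(f"projected loss: {loss:.2f} px")
```

A loss of this kind is zero when the predicted calibration matches the label and grows with the residual misalignment, with nearby points contributing more than distant ones for a translation error.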