Environment-Driven Online LiDAR-Camera Extrinsic Calibration

Zhiwei Huang, Jiaqi Li, Hongbo Zhao, Xiao Ma, Ping Zhong, Xiaohu Zhou, Wei Ye, Rui Fan

TL;DR

This work tackles online LiDAR-Camera extrinsic calibration in unstructured environments by proposing EdO-LCEC, an environment-driven framework that adapts to scene feature density. A generalizable scene discriminator creates multiple virtual cameras to enrich cross-modal features, while Dual-Path Correspondence Matching (DPCM) leverages both structural and textural cues to generate dense 3D-2D correspondences. These correspondences are fused through a multi-view and multi-scene optimization to robustly estimate the extrinsic transform $^{C}_{L}\boldsymbol{T}$, improving accuracy in sparse and limited overlap conditions. Extensive experiments on KITTI, KITTI360, nuScenes, and MIAS-LCEC demonstrate state-of-the-art performance and strong robustness, with ablations confirming the contributions of the scene discriminator and DPCM. The approach offers practical impact for reliable multi-modal fusion in autonomous systems operating in diverse real-world environments.

Abstract

LiDAR-camera extrinsic calibration (LCEC) is crucial for multi-modal data fusion in autonomous robotic systems. Existing methods, whether target-based or target-free, typically rely on customized calibration targets or fixed scene types, which limit their applicability in real-world scenarios. To address these challenges, we present EdO-LCEC, the first environment-driven online calibration approach. Unlike traditional target-free methods, EdO-LCEC employs a generalizable scene discriminator to estimate the feature density of the application environment. Guided by this feature density, EdO-LCEC extracts LiDAR intensity and depth features from varying perspectives to achieve higher calibration accuracy. To overcome the challenges of cross-modal feature matching between LiDAR and camera, we introduce dual-path correspondence matching (DPCM), which leverages both structural and textural consistency for reliable 3D-2D correspondences. Furthermore, we formulate the calibration process as a joint optimization problem that integrates global constraints across multiple views and scenes, thereby enhancing overall accuracy. Extensive experiments on real-world datasets demonstrate that EdO-LCEC outperforms state-of-the-art methods, particularly in scenarios involving sparse point clouds or partially overlapping sensor views.
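
To make the role of the 3D-2D correspondences concrete, the following minimal sketch scores a candidate extrinsic transform by its mean reprojection error under a standard pinhole camera model. It is an illustration under stated assumptions, not the paper's actual objective: the function names (`transform_point`, `project_point`, `mean_reprojection_error`), the row-major 4x4 transform `T`, and the 3x3 intrinsic matrix `K` are all hypothetical and do not come from the source.

```python
import math
from typing import Sequence, Tuple

Point3 = Tuple[float, float, float]
Pixel = Tuple[float, float]
Matrix = Sequence[Sequence[float]]


def transform_point(T: Matrix, p: Point3) -> Point3:
    """Map a LiDAR-frame point into the camera frame with a 4x4 rigid transform."""
    x, y, z = p
    cam = [T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3)]
    return (cam[0], cam[1], cam[2])


def project_point(K: Matrix, p_cam: Point3) -> Pixel:
    """Project a camera-frame point to pixel coordinates (pinhole model, assumed)."""
    x, y, z = p_cam
    if z <= 0.0:
        raise ValueError("point lies behind the camera")
    u = (K[0][0] * x + K[0][1] * y + K[0][2] * z) / z
    v = (K[1][0] * x + K[1][1] * y + K[1][2] * z) / z
    return (u, v)


def mean_reprojection_error(
    correspondences: Sequence[Tuple[Point3, Pixel]], T: Matrix, K: Matrix
) -> float:
    """Average pixel distance between projected LiDAR points and matched pixels."""
    if not correspondences:
        raise ValueError("at least one 3D-2D correspondence is required")
    total = 0.0
    for p_lidar, (u_obs, v_obs) in correspondences:
        u, v = project_point(K, transform_point(T, p_lidar))
        total += math.hypot(u - u_obs, v - v_obs)
    return total / len(correspondences)
```

A calibration routine would search for the transform that minimizes such an error over all correspondences. According to the abstract, EdO-LCEC does this jointly across multiple views and scenes, which this single-view sketch does not attempt to reproduce.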

Paper Structure

This paper contains 30 sections, 21 equations, 12 figures, 10 tables, 1 algorithm.

Figures (12)

  • Figure 1: Visualization of calibration results through LiDAR and camera data fusion in KITTI odometry 00 sequence: (a)-(i) zoomed-in regions that illustrate the alignment between the camera images and the LiDAR point clouds.
  • Figure 2: The pipeline of our proposed EdO-LCEC. The working environment of the sensors is analyzed by the generalizable scene discriminator. In each calibration scene, the feature density of LiDAR and camera data is estimated by image segmentation and depth estimation. Based on this feature density, the scene discriminator generates multiple depth and intensity virtual cameras to create LIP and LDP images. Image segmentation results (segmented masks with corner points) of virtual images and camera images are sent to DPCM to obtain dense 3D-2D correspondences, which serve as input for the multi-view and multi-scene joint optimization to derive and refine the extrinsic matrix between LiDAR and camera.
  • Figure 3: Virtual camera generation method. We distribute the virtual cameras along the $X$, $Y$, and $Z$ axes inside a sphere of radius $r = 0.3\,\mathrm{m}$. The scene discriminator dynamically determines the number of virtual cameras based on feature density. All cameras maintain a default front-facing orientation.
  • Figure 4: DPCM uses structural consistency and textural consistency in its adaptive cost function to compute the matching cost between corner points detected from different segmented masks.
  • Figure 5: Visualization of EdO-LCEC calibration results through LiDAR and camera data fusion: (a)-(b) illustrate two LiDAR point clouds in MIAS-LCEC-TF70, partially rendered by the image color using the estimated extrinsic matrix of EdO-LCEC.
  • ...and 7 more figures
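
The virtual camera layout described in the Figure 3 caption can be sketched as follows. The caption states only that cameras are distributed along the $X$, $Y$, and $Z$ axes inside a sphere of radius $r = 0.3\,\mathrm{m}$ and that their number depends on feature density. The even spacing, the inclusion of the origin, and the `steps_per_side` parameter below are assumptions made for illustration, and `virtual_camera_positions` is a hypothetical name.

```python
from typing import List, Tuple

Position = Tuple[float, float, float]


def virtual_camera_positions(steps_per_side: int, radius: float = 0.3) -> List[Position]:
    """Place virtual cameras on the X, Y and Z axes within a sphere of given radius.

    Assumption: cameras are evenly spaced on both sides of the origin, and the
    origin itself hosts one camera. The source does not specify the spacing.
    """
    if steps_per_side < 0:
        raise ValueError("steps_per_side must be non-negative")
    if radius <= 0.0:
        raise ValueError("radius must be positive")
    positions: List[Position] = [(0.0, 0.0, 0.0)]
    for axis in range(3):
        for k in range(1, steps_per_side + 1):
            offset = radius * k / steps_per_side  # never exceeds the radius
            for sign in (1.0, -1.0):
                p = [0.0, 0.0, 0.0]
                p[axis] = sign * offset
                positions.append((p[0], p[1], p[2]))
    return positions
```

With this layout, `steps_per_side` would be chosen by the scene discriminator from the estimated feature density. Every camera keeps the default front-facing orientation, so only positions need to be generated.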