LiREC-Net: A Target-Free and Learning-Based Network for LiDAR, RGB, and Event Calibration

Aditya Ranjan Dash, Ramy Battrawy, René Schuster, Didier Stricker

TL;DR

LiREC-Net is a target-free, learning-based calibration network that jointly calibrates multiple sensor modality pairs (LiDAR, RGB, and event data) within a unified framework. It performs competitively with bi-modal models and sets a strong new baseline for the tri-modal use case.

Abstract

Advanced autonomous systems rely on multi-sensor fusion for safer and more robust perception. To enable effective fusion, calibrating directly from natural driving scenes (i.e., target-free) with high accuracy is crucial for precise multi-sensor alignment. Existing learning-based calibration methods are typically designed for only a single pair of sensor modalities (i.e., a bi-modal setup). Unlike these methods, we propose LiREC-Net, a target-free, learning-based calibration network that jointly calibrates multiple sensor modality pairs, including LiDAR, RGB, and event data, within a unified framework. To reduce redundant computation and improve efficiency, we introduce a shared LiDAR representation that leverages features from both the 3D point cloud and its projected depth map, ensuring better consistency across modalities. Trained and evaluated on established datasets, such as KITTI and DSEC, our LiREC-Net performs competitively with bi-modal models and sets a strong new baseline for the tri-modal use case.

Paper Structure (41 sections, 9 equations, 8 figures, 12 tables)

Figures (8)

  • Figure 1: Our LiREC-Net takes miscalibrated tri-modal inputs and learns to produce spatially aligned outputs. The top row shows the raw miscalibrated overlays, while the bottom row illustrates the calibrated LiDAR projected onto both RGB and event frames.
  • Figure 2: Overview of LiREC-Net. A miscalibrated LiDAR point cloud $P$ is processed by two LiDAR encoders (point- and depth-based). Point features are projected to the image plane using known intrinsics $\mathbf{K}_{\mathrm{RGB}}$ and $\mathbf{K}_{\mathrm{Ev}}$, and then fused with depth features to form a unified LiDAR embedding. In parallel, the RGB image $I$ and event representation $E$ are encoded by their respective encoders. The unified LiDAR embedding is combined with the corresponding RGB/event features to build two pair-wise cost volumes, which are refined by context modules and passed to prediction heads that output the LiDAR-RGB and LiDAR-Event extrinsics $\hat{\mathbf T}^{\mathrm{Li}-\mathrm{RGB}}$ and $\hat{\mathbf T}^{\mathrm{Li}-\mathrm{Ev}}$.
  • Figure 3: Qualitative results on KITTI for both LiDAR-RGB and LiDAR-Event pairs.
  • Figure 4: Qualitative results on DSEC for both LiDAR-RGB and LiDAR-Event pairs.
  • Figure 5: Per-stage visualization for LiDAR-RGB calibration on KITTI. The exact translation and rotation errors are provided for the perturbed input and for each stage. For the miscalibrated input, no LiDAR points fall inside the RGB image boundary because of the large perturbation.
  • ...and 3 more figures
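
The captions for Figures 2 and 5 both rely on projecting LiDAR points onto an image plane using known intrinsics and a (possibly miscalibrated) extrinsic transform. The following is a minimal NumPy sketch of that standard pinhole projection, not the authors' implementation. The function name `project_lidar_to_depth`, its arguments, and the choice to keep the nearest point per pixel are assumptions made for illustration.

```python
import numpy as np


def project_lidar_to_depth(points, T, K, height, width):
    """Project LiDAR points into a sparse depth map (hypothetical helper).

    points: (N, 3) array of points in the LiDAR frame.
    T:      (4, 4) LiDAR-to-camera extrinsic transform.
    K:      (3, 3) camera intrinsic matrix.
    Returns a (height, width) array; 0 marks pixels without a LiDAR return.
    """
    # Move the points into the camera frame using homogeneous coordinates.
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])
    cam = (T @ pts_h.T).T[:, :3]

    # Discard points behind (or numerically on) the image plane.
    cam = cam[cam[:, 2] > 1e-6]

    # Pinhole projection: apply the intrinsics, then divide by depth.
    uvw = (K @ cam.T).T
    z = cam[:, 2]
    u = np.floor(uvw[:, 0] / z).astype(int)
    v = np.floor(uvw[:, 1] / z).astype(int)

    # Keep only the points that land inside the image boundary.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]

    # Assumption: when several points hit one pixel, keep the nearest one.
    depth = np.full((height, width), np.inf, dtype=np.float64)
    np.minimum.at(depth, (v, u), z)
    depth[np.isinf(depth)] = 0.0
    return depth
```

Under these assumptions, a sufficiently large error in `T` pushes every projected point outside the image, so the returned map is all zeros. This matches the situation described for the miscalibrated input in Figure 5.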