Unleashing the Power of Discrete-Time State Representation: Ultrafast Target-based IMU-Camera Spatial-Temporal Calibration

Junlin Song, Antoine Richard, Miguel Olivares-Mendez

TL;DR

The paper addresses the computational bottleneck of continuous-time IMU–camera calibration by adopting a discrete-time formulation that also solves temporal alignment via a time offset. It introduces an on-manifold, higher-order IMU preintegration (Midpoint) to form compact pseudo-measurements and jointly estimate gravity direction and biases alongside spatial-temporal calibration, integrated within a full-batch Levenberg–Marquardt optimization. With an AprilTag-based target, the camera residuals couple spatial calibration and time offset, yielding an efficient calibration framework that preserves VIO accuracy. Empirical results on EuRoC and TUM-VI demonstrate orders-of-magnitude speedups (often hundreds of times faster than Kalibr) while maintaining competitive calibration accuracy, supporting scalable factory calibration for drones, phones, and AR devices.
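The midpoint preintegration mentioned above can be illustrated with a generic sketch. This section does not reproduce the paper's equations, so what follows is a textbook-style midpoint scheme under stated assumptions, not the authors' implementation: the function names (`so3_exp`, `preintegrate_midpoint`), the sample layout `(t, gyro, accel)`, and the constant-bias arguments `bg` and `ba` are all hypothetical. Gravity is left out of the preintegrated terms, on the assumption that it is handled in the residual.

```python
import math

def mat_mul(A, B):
    """3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def mat_vec(A, v):
    """3x3 matrix times 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def so3_exp(phi):
    """Rodrigues' formula: rotation vector -> rotation matrix."""
    x, y, z = phi
    theta = math.sqrt(x * x + y * y + z * z)
    K = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]  # skew-symmetric matrix of phi
    if theta < 1e-10:
        a, b = 1.0, 0.5  # small-angle limits of the two coefficients
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    K2 = mat_mul(K, K)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def preintegrate_midpoint(samples, bg=(0.0, 0.0, 0.0), ba=(0.0, 0.0, 0.0)):
    """Midpoint preintegration of IMU samples [(t, gyro, accel), ...].

    Returns (dR, dv, dp) expressed in the frame of the first sample.
    Gravity is not included; biases are assumed constant over the interval.
    """
    dR = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    dv = [0.0, 0.0, 0.0]
    dp = [0.0, 0.0, 0.0]
    for (t0, w0, a0), (t1, w1, a1) in zip(samples, samples[1:]):
        dt = t1 - t0
        # Average the two gyro readings, then update rotation on the manifold.
        w_mid = [0.5 * (w0[i] + w1[i]) - bg[i] for i in range(3)]
        dR_next = mat_mul(dR, so3_exp([w * dt for w in w_mid]))
        # Rotate each accel reading with its own attitude, then average.
        acc0 = mat_vec(dR, [a0[i] - ba[i] for i in range(3)])
        acc1 = mat_vec(dR_next, [a1[i] - ba[i] for i in range(3)])
        a_mid = [0.5 * (acc0[i] + acc1[i]) for i in range(3)]
        dp = [dp[i] + dv[i] * dt + 0.5 * a_mid[i] * dt * dt for i in range(3)]
        dv = [dv[i] + a_mid[i] * dt for i in range(3)]
        dR = dR_next
    return dR, dv, dp

# Usage: 1 s of constant 1 m/s^2 acceleration along x, no rotation, 100 Hz.
samples = [(k * 0.01, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)) for k in range(101)]
dR, dv, dp = preintegrate_midpoint(samples)
```

The resulting `(dR, dv, dp)` triple is the kind of compact pseudo-measurement that can link two consecutive discrete states in a batch optimization, so the raw IMU samples do not need to be revisited at every iteration.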

Abstract

Visual-inertial fusion is crucial for a large number of intelligent and autonomous applications, such as robot navigation and augmented reality. To bootstrap and achieve optimal state estimation, the spatial-temporal displacements between the IMU and cameras must be calibrated in advance. Most existing calibration methods adopt a continuous-time state representation, specifically the B-spline. Although these methods achieve precise spatial-temporal calibration, they suffer from the high computational cost of the continuous-time state representation. To this end, we propose a novel and extremely efficient calibration method that unleashes the power of discrete-time state representation. Moreover, the weakness of discrete-time state representation in temporal calibration is tackled in this paper. The production of drones, cellphones and other visual-inertial platforms keeps increasing: if one million devices around the world need calibration, saving one minute on the calibration of each device saves 2083 work days in total. To benefit both the research and industry communities, the open-source implementation is released at https://github.com/JunlinSong/DT-VI-Calib.
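The 2083-day figure can be checked with a few lines of arithmetic. The abstract does not state how long a work day is; the sketch below assumes 8 hours, which reproduces the quoted number. The variable names are illustrative.

```python
devices = 1_000_000            # devices to calibrate (from the abstract)
minutes_saved_per_device = 1   # time saved per calibration (from the abstract)
hours_per_work_day = 8         # assumption: not stated in the abstract

hours_saved = devices * minutes_saved_per_device / 60
work_days_saved = hours_saved / hours_per_work_day

print(int(work_days_saved))    # 2083
```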

Paper Structure

This paper contains 14 sections, 23 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: (a) Stereo visual-inertial sensor prototype of the TUM-VI dataset (Schubert et al., 2018). (b) The spatial-temporal relationship between IMU and camera.
  • Figure 2: Coordinate frames for the IMU-Camera calibration with a calibration board.
  • Figure 3: Time shift of each IMU motion state corresponding to an image. After the image timestamps are shifted by the time offset $t_d$, $t_i$ and $t_{i+1}$ become $t'_i$ and $t'_{i+1}$, respectively: $t'_i = t_i + t_d$ and $t'_{i+1} = t_{i+1} + t_d$.
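
The time shift in the Figure 3 caption can be sketched in a few lines. This is a minimal illustration of the relation $t'_i = t_i + t_d$ only, not the paper's implementation; the helper names, the 200 Hz IMU rate, the 20 Hz image rate, and the 2.5 ms offset are assumptions chosen for the example.

```python
from bisect import bisect_right

def shift_image_times(image_times, t_d):
    """Apply t'_i = t_i + t_d to every image timestamp."""
    return [t + t_d for t in image_times]

def bracketing_imu_index(imu_times, t):
    """Index k with imu_times[k] <= t < imu_times[k + 1], clamped to a valid interval."""
    k = bisect_right(imu_times, t) - 1
    return max(0, min(k, len(imu_times) - 2))

# Hypothetical data: IMU at 200 Hz, images at 20 Hz, offset of 2.5 ms.
imu_times = [k * 0.005 for k in range(41)]
image_times = [0.0, 0.05, 0.10]
t_d = 0.0025

shifted = shift_image_times(image_times, t_d)
intervals = [bracketing_imu_index(imu_times, t) for t in shifted]
```

Because every image timestamp moves by the same $t_d$, the spacing between consecutive states is unchanged, but each state now falls between a different pair of IMU samples, which is why the IMU data between states has to be re-associated whenever the offset estimate changes.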