OKVIS2-X: Open Keyframe-based Visual-Inertial SLAM Configurable with Dense Depth or LiDAR, and GNSS

Simon Boche; Jaehyung Jung; Sebastián Barbas Laina; Stefan Leutenegger

TL;DR

OKVIS2-X tackles robust, accurate SLAM in large-scale environments by unifying visual-inertial sensing with dense volumetric occupancy mapping and optional depth or LiDAR inputs, all fused in a single factor-graph optimization. It introduces submap-based occupancy maps tightly coupled with the estimator, online camera-IMU extrinsics calibration, and GNSS fusion, yielding globally consistent maps and real-time operation on trajectories of up to $9\,\mathrm{km}$. The approach is validated on the EuRoC, Hilti-Oxford, and VBR datasets, where it shows state-of-the-art results in VI and VI-LiDAR configurations and resilience to GNSS outages. The resulting dense maps are directly usable for autonomous navigation, and the system is released as open source.

Abstract

To empower mobile robots with usable maps as well as the highest state estimation accuracy and robustness, we present OKVIS2-X: a state-of-the-art multi-sensor Simultaneous Localization and Mapping (SLAM) system that builds dense volumetric occupancy maps, scales to large environments, and operates in real time. Our unified SLAM framework seamlessly integrates different sensor modalities: visual, inertial, measured or learned depth, LiDAR, and Global Navigation Satellite System (GNSS) measurements. Unlike most state-of-the-art SLAM systems, we advocate using dense volumetric map representations when leveraging depth or range-sensing capabilities. We employ an efficient submapping strategy that allows our system to scale to large environments, showcased in sequences of up to 9 kilometers. OKVIS2-X enhances its accuracy and robustness by tightly coupling the estimator and submaps through map alignment factors. Our system provides globally consistent maps that are directly usable for autonomous navigation. To further improve the accuracy of OKVIS2-X, we also incorporate the option of performing online calibration of camera extrinsics. Our system achieves the highest trajectory accuracy on EuRoC against state-of-the-art alternatives, outperforms all competitors in the Hilti22 VI-only benchmark while also proving competitive in the LiDAR version, and showcases state-of-the-art accuracy on the diverse, large-scale sequences of the VBR dataset.
