Drift-free Visual SLAM using Digital Twins
Roxane Merat, Giovanni Cioffi, Leonard Bauersfeld, Davide Scaramuzza
TL;DR
This work tackles drift in visual-inertial odometry and visual SLAM for urban operation by registering the sparse 3D point cloud generated by the VIO/VSLAM system against a city digital twin using point-to-plane ICP. The registration yields a global 6-DoF measurement that is integrated into the SLAM back-end. The method introduces an adaptive weighting scheme to stabilize the map-registration residuals in degenerate scenes and provides an initial frame alignment that transitions to map-based alignment once convergence is achieved. The approach, implemented as SVO-Digital Twin, is evaluated on a high-fidelity GPS simulator and on real-world drone flights, where it outperforms state-of-the-art VIO-GPS systems and is more robust to viewpoint changes than state-of-the-art visual SLAM systems. The results indicate that using digital-twin geometry for global localization reduces long-term pose drift in urban environments.
Abstract
Globally-consistent localization in urban environments is crucial for autonomous systems such as self-driving vehicles and drones, as well as assistive technologies for visually impaired people. Traditional Visual-Inertial Odometry (VIO) and Visual Simultaneous Localization and Mapping (VSLAM) methods, though adequate for local pose estimation, suffer from drift in the long term due to reliance on local sensor data. While GPS counteracts this drift, it is unavailable indoors and often unreliable in urban areas. An alternative is to localize the camera to an existing 3D map using visual-feature matching. This can provide centimeter-level accurate localization but is limited by the visual similarities between the current view and the map. This paper introduces a novel approach that achieves accurate and globally-consistent localization by aligning the sparse 3D point cloud generated by the VIO/VSLAM system to a digital twin using point-to-plane matching; no visual data association is needed. The proposed method provides a 6-DoF global measurement tightly integrated into the VIO/VSLAM system. Experiments run on a high-fidelity GPS simulator and real-world data collected from a drone demonstrate that our approach outperforms state-of-the-art VIO-GPS systems and offers superior robustness against viewpoint changes compared to the state-of-the-art Visual SLAM systems.
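To make the point-to-plane matching concrete, the sketch below shows one linearized alignment step in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names `point_to_plane_step` and `rotation_from_vector` are hypothetical, correspondences between sparse points and digital-twin surface samples are assumed to be given (a full ICP loop would re-associate them at each iteration), and the adaptive weighting mentioned in the TL;DR is omitted.

```python
import numpy as np


def point_to_plane_step(points, plane_points, normals):
    """One linearized point-to-plane step (small-angle approximation).

    points       : (N, 3) sparse SLAM points (assumed already matched)
    plane_points : (N, 3) matched points on the digital-twin surface
    normals      : (N, 3) unit surface normals at plane_points

    Returns (omega, t): a rotation vector and a translation that reduce
    sum_i (n_i . (R p_i + t - q_i))^2 with R ~ I + [omega]_x.
    """
    # Each row of the Jacobian is [p_i x n_i, n_i].
    A = np.hstack([np.cross(points, normals), normals])
    # Signed point-to-plane distances before the update, negated.
    b = -np.einsum("ij,ij->i", normals, points - plane_points)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3], x[3:]


def rotation_from_vector(omega):
    """Rodrigues' formula: rotation vector to 3x3 rotation matrix."""
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)
    k = omega / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
```

Because each residual only measures distance along the surface normal, points can slide freely within a plane. A scene with a single dominant plane therefore leaves some degrees of freedom unconstrained, which is the kind of degenerate case the adaptive weighting is described as addressing.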
