DBA-Fusion: Tightly Integrating Deep Dense Visual Bundle Adjustment with Multiple Sensors for Large-Scale Localization and Mapping

Yuxuan Zhou, Xingxing Li, Shengyu Li, Xuanbin Wang, Shaoquan Feng, Yuxuan Tan

TL;DR

This letter tightly integrates the trainable deep dense bundle adjustment (DBA) with multi-sensor information through a factor graph for multi-sensor fusion, which enables real-time dense mapping in large-scale environments.

Abstract

Visual simultaneous localization and mapping (VSLAM) has broad applications, with state-of-the-art methods leveraging deep neural networks for better robustness and applicability. However, there is a lack of research on fusing these learning-based methods with multi-sensor information, which could be indispensable for pushing related applications to large-scale and complex scenarios. In this paper, we tightly integrate the trainable deep dense bundle adjustment (DBA) with multi-sensor information through a factor graph. In the framework, recurrent optical flow and DBA are performed among sequential images. The Hessian information derived from DBA is fed into a generic factor graph for multi-sensor fusion, which employs a sliding window and supports probabilistic marginalization. A pipeline for visual-inertial integration is first developed, which provides the minimum capability for metric-scale localization and mapping. Furthermore, other sensors (e.g., global navigation satellite system) are integrated for drift-free and geo-referenced operation. Extensive tests are conducted on both public and self-collected datasets. The results validate the superior localization performance of our approach, which enables real-time dense mapping in large-scale environments. The code has been made open-source (https://github.com/GREAT-WHU/DBA-Fusion).

Paper Structure

This paper contains 14 sections, 16 equations, 13 figures, 3 tables.

Figures (13)

  • Figure 1: Illustration of the DBA-Fusion system. The point cloud map is generated online via monocular visual-inertial-GNSS integration. For clarity, we only show 5x downsampled camera poses.
  • Figure 2: Illustration of the recurrent optical flow. The images are first encoded to feature maps. One co-visible image pair $(i,j)$ is fed into the recurrent optical flow module, as described in Sect. III-A.
  • Figure 3: Workflow of DBA in the optimization framework, which employs a hierarchical iteration structure. The right panel shows the elimination of the depth state, which generates the Hessian information $\mathbf{H}_{c,\text{all}}$, $\mathbf{v}_{c,\text{all}}$ as the visual constraint. For marginalization, only related edges are processed to generate $\mathbf{H}_{c,\text{marg}}$, $\mathbf{v}_{c,\text{marg}}$, which contain the marginalized visual information.
  • Figure 4: Marginalization of the factor graph. Visual information, as well as other information related to the oldest states (black edges), is marginalized using the Schur complement, which eliminates the oldest states and forms a new constraint, as in OKVIS and GTSAM. The re-projections from newer frames to the old frame (red edges) are neglected to keep sparsity.
  • Figure 5: The generic pose-centered factor graph that is used for multi-sensor fusion. Note that the visual factor is modeled using a Hessian factor where the point depths are temporarily eliminated. The visual factor is iteratively updated with the recurrent optical flow and DBA Hessian formulation.
  • ...and 8 more figures
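
The depth elimination mentioned in the Figure 3 caption can be illustrated with a small Schur-complement sketch. This is not the authors' implementation: the function name `eliminate_depth_block`, the variable names, and the toy numbers are hypothetical, and the code assumes the depth block of the Hessian is diagonal (each depth coupled only to poses, not to other depths). Under that assumption, the full system $[\mathbf{B}, \mathbf{E}; \mathbf{E}^\top, \mathbf{C}]$ with right-hand side $[\mathbf{v}; \mathbf{w}]$ reduces to a pose-only system $\mathbf{H}_c = \mathbf{B} - \mathbf{E}\mathbf{C}^{-1}\mathbf{E}^\top$, $\mathbf{v}_c = \mathbf{v} - \mathbf{E}\mathbf{C}^{-1}\mathbf{w}$.

```python
def eliminate_depth_block(B, E, C_diag, v, w):
    """Schur complement that removes the depth states from a normal equation.

    Full system (assumed layout, for illustration only):
        [B   E] [dx]   [v]
        [E^T C] [dd] = [w],   with C = diag(C_diag)
    Returns the pose-only Hessian H_c and vector v_c.
    """
    n = len(B)        # number of pose parameters
    m = len(C_diag)   # number of depth parameters
    # H_c = B - E C^{-1} E^T; C is diagonal, so its inverse is element-wise.
    H_c = [[B[i][j] - sum(E[i][k] * E[j][k] / C_diag[k] for k in range(m))
            for j in range(n)]
           for i in range(n)]
    # v_c = v - E C^{-1} w
    v_c = [v[i] - sum(E[i][k] * w[k] / C_diag[k] for k in range(m))
           for i in range(n)]
    return H_c, v_c


# Toy example: two pose parameters and two depth parameters (made-up numbers).
B = [[4.0, 1.0], [1.0, 3.0]]   # pose-pose block
E = [[1.0, 0.0], [0.0, 2.0]]   # pose-depth coupling
C_diag = [2.0, 4.0]            # diagonal of the depth-depth block
v = [3.0, 5.0]                 # pose part of the right-hand side
w = [2.0, 4.0]                 # depth part of the right-hand side

H_c, v_c = eliminate_depth_block(B, E, C_diag, v, w)
print(H_c)  # reduced pose-only Hessian
print(v_c)  # reduced pose-only vector
```

Solving the reduced system for the poses and back-substituting for the depths gives the same solution as the full system. The marginalization in Figure 4 applies the same operation, with the oldest states taking the place of the depths; in that case the eliminated block is generally not diagonal and needs a full matrix inverse or factorization.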