LCP-Fusion: A Neural Implicit SLAM with Enhanced Local Constraints and Computable Prior

Jiahui Wang, Yinan Deng, Yi Yang, Yufeng Yue

TL;DR

LCP-Fusion is a neural implicit SLAM system with enhanced local constraints and a computable prior. It uses a sparse voxel octree containing feature grids and SDF priors as a hybrid scene representation, which provides scalability and robustness during mapping and tracking.

Abstract

Dense Simultaneous Localization and Mapping (SLAM) based on neural implicit representations has recently shown impressive progress in hole filling and high-fidelity mapping. Nevertheless, existing methods either rely heavily on known scene bounds or suffer from inconsistent reconstruction due to drift in potential loop-closure regions, or both, which can be attributed to inflexible representations and a lack of local constraints. In this paper, we present LCP-Fusion, a neural implicit SLAM system with enhanced local constraints and a computable prior. It uses a sparse voxel octree containing feature grids and SDF priors as a hybrid scene representation, which provides scalability and robustness during mapping and tracking. To enhance the local constraints, we propose a novel sliding window selection strategy based on visual overlap to address loop closure, and a practical warping loss to constrain relative poses. Moreover, we estimate SDF priors as a coarse initialization for the implicit features, which brings additional explicit constraints and robustness, especially when a light but efficient adaptive early ending is adopted. Experiments demonstrate that our method achieves better localization accuracy and reconstruction consistency than existing RGB-D implicit SLAM methods, especially in challenging real scenes (ScanNet) as well as self-captured scenes with unknown scene bounds. The code is available at https://github.com/laliwang/LCP-Fusion.
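
The abstract does not say how visual overlap is measured, so the following is only a minimal sketch of one plausible way to pick a sliding window by overlap, not the authors' implementation. It assumes a pinhole camera with intrinsics `K`, 4x4 camera-to-world poses, and a depth map in metres. The names `overlap_ratio`, `select_window`, `stride`, and `window_size` are hypothetical. Overlap is approximated as the fraction of sampled current-frame pixels that, after back-projection, land inside a keyframe's image.

```python
import numpy as np

def overlap_ratio(depth, K, T_wc_cur, T_wc_kf, stride=8):
    """Fraction of sampled current-frame pixels that reproject into a keyframe.

    depth     : (H, W) depth map of the current frame (0 marks invalid pixels)
    K         : (3, 3) pinhole intrinsics (assumed shared by both frames)
    T_wc_cur  : (4, 4) camera-to-world pose of the current frame
    T_wc_kf   : (4, 4) camera-to-world pose of the candidate keyframe
    """
    h, w = depth.shape
    # Sample a sparse pixel grid, offset from the image border.
    v, u = np.mgrid[stride // 2:h:stride, stride // 2:w:stride]
    u, v = u.ravel(), v.ravel()
    z = depth[v, u].astype(np.float64)
    valid = z > 0
    if not valid.any():
        return 0.0
    u, v, z = u[valid], v[valid], z[valid]

    # Back-project the samples into the current camera frame.
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cur = np.stack([x, y, z, np.ones_like(z)], axis=0)  # shape (4, N)

    # Current camera -> world -> keyframe camera.
    pts_kf = np.linalg.inv(T_wc_kf) @ T_wc_cur @ pts_cur
    zk = pts_kf[2]
    in_front = zk > 1e-6

    # Project only points in front of the keyframe camera.
    uk = np.full_like(zk, -1.0)
    vk = np.full_like(zk, -1.0)
    uk[in_front] = K[0, 0] * pts_kf[0, in_front] / zk[in_front] + K[0, 2]
    vk[in_front] = K[1, 1] * pts_kf[1, in_front] / zk[in_front] + K[1, 2]

    inside = in_front & (uk >= 0) & (uk < w) & (vk >= 0) & (vk < h)
    return float(inside.mean())

def select_window(depth, K, T_wc_cur, keyframe_poses, window_size=4):
    """Return indices of the keyframes with the highest overlap, plus all scores."""
    scores = [overlap_ratio(depth, K, T_wc_cur, T) for T in keyframe_poses]
    order = np.argsort(scores)[::-1][:window_size]
    return [int(i) for i in order], scores
```

Ranking by overlap rather than by recency means that an old keyframe observing the same region can re-enter the window when the camera revisits it, which is the situation the abstract refers to as a potential loop-closure region. This sketch ignores occlusion and depth consistency, which a real system would likely need to check.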

Paper Structure

This paper contains 14 sections, 10 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Example from the baseline [yang2022vox] of inconsistent surfaces due to drift in potential loop-closure regions composed of frames 119, 3449 and 5549 (top). Our method can reconstruct unknown scenes with less drift by using enhanced local constraints and easily computable SDF priors (bottom).
  • Figure 2: Overview of our LCP-Fusion system. Receiving RGB-D inputs with initialized poses from the tracking process, we jointly optimize the hybrid scene representation and the camera poses within our sliding window, which contains more visual overlap with the current frame in potential loop-closure regions. Additionally, the proposed warping loss provides sufficient constraints on relative poses between frames with visual overlap.
  • Figure 3: Visualization of SDF prior estimates. To avoid unreasonable SDF priors due to occlusion, we indicate three extreme cases from left to right.
  • Figure 4: Aerial view of the reconstruction of scene0181 from ScanNet [dai2017scannet] using different sliding window designs. The comparison shows that our sliding window selection method not only improves localization but also alleviates forgetting.
  • Figure 5: Reconstructed mesh results of potential loop-closure regions in ScanNet. For comparison, red boxes highlight regions with inconsistent surfaces, while green boxes highlight regions with consistent surfaces.
  • ...and 5 more figures