V3D-SLAM: Robust RGB-D SLAM in Dynamic Environments with 3D Semantic Geometry Voting

Tuan Dang, Khang Nguyen, Manfred Huber

TL;DR

V3D-SLAM removes moving objects through two lightweight re-evaluation stages: (1) a spatial-reasoned Hough voting mechanism that separates potentially moving objects from static ones, and (2) a refinement of the static objects that detects dynamic noise caused by intra-object motions, using Chamfer distances as the similarity measure.

Abstract

Simultaneous localization and mapping (SLAM) in highly dynamic environments is challenging because of the complex correlation between moving objects and the camera pose. Many methods have been proposed to deal with this problem; however, the motion properties of dynamic objects observed from a moving camera remain unclear. Therefore, to improve SLAM's performance, the disruption caused by moving objects must be minimized using a physical understanding of the 3D shapes and dynamics of objects. In this paper, we propose a robust method, V3D-SLAM, that removes moving objects via two lightweight re-evaluation stages: identifying potentially moving and static objects using a spatial-reasoned Hough voting mechanism, and refining static objects by detecting dynamic noise caused by intra-object motions using Chamfer distances as similarity measurements. Our experiment on the dynamic sequences of the TUM RGB-D benchmark, which provide ground-truth camera trajectories, showed that our method outperforms the most recent state-of-the-art SLAM methods. Our source code is available at https://github.com/tuantdang/v3d-slam.
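To make the similarity measure in the second stage concrete, the following is a minimal sketch of a symmetric Chamfer distance between two small 3D point sets. It is an illustration only, not the authors' implementation: the function name `chamfer_distance`, the choice of unsquared Euclidean distances averaged per direction, and the toy point clouds are all assumptions (Chamfer distance has several common variants, and the abstract does not say which one V3D-SLAM uses).

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty sets of 3D points.

    Assumed variant: mean nearest-neighbour Euclidean distance from A to B
    plus the same quantity from B to A. Brute force, O(len(A) * len(B)).
    """
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")

    def mean_nearest(src, dst):
        # Average distance from each point in src to its closest point in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(points_a, points_b) + mean_nearest(points_b, points_a)


# Hypothetical toy data: the same object observed in two frames.
prev_cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
same_cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted_cloud = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]

print(chamfer_distance(prev_cloud, same_cloud))     # identical shapes
print(chamfer_distance(prev_cloud, shifted_cloud))  # shape displaced by 0.5
```

A value near zero suggests the two point sets are nearly the same shape in the same place, while a larger value indicates displacement or deformation. For real point clouds, a spatial index such as a k-d tree would replace the brute-force nearest-neighbour search.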

Paper Structure

This paper contains 21 sections, 4 equations, 6 figures, 3 tables, 1 algorithm.

Figures (6)

  • Figure 1: Overview of V3D-SLAM: improving the robustness of RGB-D SLAM in dynamic indoor environments, including instance segmentation coupled with RGB-based feature extraction (Sec. \ref{sec:segmentation_extraction}), rejection of sensor noise and segmentation outliers (Sec. \ref{sec:outlier_rejection}), and a spatial-reasoned Hough voting mechanism for dynamic 3D objects (Sec. \ref{sec:geometry_voting}), resulting in camera trajectory estimation (Sec. \ref{sec:evaluation}).
  • Figure 2: Segmentation of hypothesized moving objects with key points on static objects and background, and point clouds of object instances.
  • Figure 3: Outlier removal with semantic perception on point clouds.
  • Figure 4: Spatial-reasoned Hough voting mechanism for moving objects (right) after computing the accumulator array in the previous (left) and current (middle) frames. Red lines illustrate Euclidean distances between one of the 'person' objects and the other entities present. The moving 'person' object and the 'chair' object are identified using Alg. \ref{alg:voting_dynamic}; meanwhile, the other 'person' object is only identified as an intra-moving object, as he wobbles his head and his 3D centroid does not exceed the distance threshold for deformable objects.
  • Figure 5: Qualitative results of camera trajectories of TUM RGB-D dynamic sequences estimated by our method and CFP-SLAM \cite{hu2022cfp}. The ground truth, the estimated trajectory, and their differences are encoded as black lines, blue lines, and red lines, respectively.
  • ...and 1 more figure
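The Figure 4 caption describes voting based on Euclidean distances between object centroids in the previous and current frames. The sketch below illustrates that general idea under stated assumptions; it is not the paper's voting algorithm, whose details are not given in this section. The function name `vote_moving_objects`, the majority rule, the single shared `threshold`, and the toy centroids are all hypothetical. The sketch relies on one certain fact: pairwise Euclidean distances between static objects are unchanged by rigid camera motion, so an object whose distance to most other objects changes is a candidate for having moved.

```python
import math


def vote_moving_objects(prev_centroids, curr_centroids, threshold):
    """Flag objects whose distance to most other objects changed between frames.

    prev_centroids / curr_centroids: dict mapping object label -> (x, y, z).
    threshold: assumed minimum change in pairwise distance that counts as a vote.
    """
    # Only objects visible in both frames can be compared.
    labels = sorted(set(prev_centroids) & set(curr_centroids))
    moving = []
    for a in labels:
        votes = 0
        for b in labels:
            if a == b:
                continue
            d_prev = math.dist(prev_centroids[a], prev_centroids[b])
            d_curr = math.dist(curr_centroids[a], curr_centroids[b])
            if abs(d_curr - d_prev) > threshold:
                votes += 1  # object b "votes" that a has moved
        others = len(labels) - 1
        # Assumed rule: a strict majority of the other objects must vote.
        if others > 0 and votes > others / 2:
            moving.append(a)
    return moving


# Hypothetical scene: only person_1 changes position between frames.
prev = {"person_1": (0.0, 0.0, 2.0), "person_2": (1.0, 0.0, 2.0),
        "chair": (2.0, 0.0, 3.0), "desk": (-1.0, 0.0, 3.0)}
curr = dict(prev, person_1=(0.5, 0.0, 2.0))

print(vote_moving_objects(prev, curr, threshold=0.1))  # ['person_1']
```

In this toy scene, every other object sees its distance to `person_1` change by more than the threshold, while each static object receives only the single vote caused by `person_1`, so the majority rule isolates the mover. A caption detail this sketch does not model is the separate distance threshold for deformable objects.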