Semantic Masking and Visual Feature Matching for Robust Localization

Luisa Mao, Ryan Soussan, Brian Coltin, Trey Smith, Joydeep Biswas

TL;DR

This work presents a computationally efficient semantic masking approach for visual feature matching. It improves the accuracy and robustness of visual localization during long-term deployment in changing environments and can be applied to any visual feature matching pipeline.

Abstract

We are interested in long-term deployments of autonomous robots to aid astronauts with maintenance and monitoring operations in settings such as the International Space Station. Unfortunately, such environments tend to be highly dynamic and unstructured, and their frequent reconfiguration poses a challenge for robust long-term localization of robots. Many state-of-the-art visual feature-based localization algorithms are not robust to spatial scene changes, and SLAM algorithms, while promising, cannot run within the low compute budget available to space robots. To address this gap, we present a computationally efficient semantic masking approach for visual feature matching that improves the accuracy and robustness of visual localization systems during long-term deployment in changing environments. Our method introduces a lightweight check that requires each match to lie within long-term static objects and to connect features of the same semantic class. We evaluate this approach using both map-based relocalization and relative pose estimation and show that it improves Absolute Trajectory Error (ATE) and correct match ratios on the publicly available Astrobee dataset. While this approach was originally developed for microgravity robotic freeflyers, it can be applied to any visual feature matching pipeline to improve robustness.
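
The lightweight check described above can be illustrated with a minimal sketch. It assumes that detections arrive as axis-aligned bounding boxes with integer class IDs and that candidate matches are index pairs into two keypoint arrays. The function names, the box format, and the "first detection wins" rule for overlapping boxes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

UNLABELED = -1  # hypothetical marker for keypoints outside every detection


def label_keypoints(keypoints, boxes, classes):
    """Assign each keypoint the class of the first box that contains it.

    keypoints: (N, 2) array of (x, y) pixel coordinates.
    boxes:     list of (x_min, y_min, x_max, y_max) tuples (assumed format).
    classes:   list of integer class IDs, one per box.
    """
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    labels = np.full(len(keypoints), UNLABELED, dtype=int)
    for (x0, y0, x1, y1), cls in zip(boxes, classes):
        inside = (
            (keypoints[:, 0] >= x0) & (keypoints[:, 0] <= x1)
            & (keypoints[:, 1] >= y0) & (keypoints[:, 1] <= y1)
        )
        # Only label keypoints that no earlier detection has claimed.
        labels[inside & (labels == UNLABELED)] = cls
    return labels


def filter_matches(matches, labels_a, labels_b):
    """Keep matches whose endpoints both lie in detections of the same class."""
    return [
        (i, j)
        for i, j in matches
        if labels_a[i] != UNLABELED and labels_a[i] == labels_b[j]
    ]


# Usage: three candidate matches; only the first joins two class-0 regions.
kp_a = np.array([[10, 10], [50, 50], [90, 90]])
kp_b = np.array([[12, 12], [55, 55], [95, 95]])
boxes = [(0, 0, 20, 20), (40, 40, 60, 60)]
labels_a = label_keypoints(kp_a, boxes, classes=[0, 1])
labels_b = label_keypoints(kp_b, boxes, classes=[0, 2])
kept = filter_matches([(0, 0), (1, 1), (2, 2)], labels_a, labels_b)
```

In this sketch the second match is dropped because its endpoints carry different classes, and the third because neither endpoint falls inside a detection. The surviving matches would then be passed to whatever pose estimator the pipeline already uses.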

Paper Structure

This paper contains 29 sections, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Feature matching with and without bounding boxes. Horizontal image pairs taken several years apart display multiple scene changes, including a rotated ISS flag that causes faulty associations and a failed relative pose estimate in the top image pair. With semantic masks applied to the matches (bottom image pair), detections of stable scene elements including vents (purple), lights (blue), and handrails (red) enable the pruning of faulty associations due to environment changes and successful relative pose estimation.
  • Figure 2: Astrobee free-flying robots roaming the ISS during an activity. Background objects such as laptops, wires, and cargo bags are often moved between flights and can cause localization errors for the robots.
  • Figure 3: The semantic image matching pipeline adds semantic segmentation stages (blue) to a visual feature matching pipeline (red) to improve pose estimation accuracy. The pipeline detects semantic objects in each image (Fig. \ref{fig:bounding_boxes}) and generates masked image-space regions for each detection in each object class (Fig. \ref{fig:masked_images}). It then detects visual features in the masked regions and performs matching between features of the same class for each pair of images. Finally, the pipeline estimates the relative pose between the images using the resulting matches.
  • Figure 4: The XYZ position of the Astrobee through time in the tb_yaw sequence is plotted above. The non-semantic localizer accrues a position offset in the middle of the plot (visible as a discontinuous step) whereas the semantic localizer maintains its fixed position.
  • Figure 5: The addition of semantics helps the robot track its orientation during an in-place rotation in the tb_yaw sequence. Here, the non-semantic relocalizer localizes upside-down and observes a reversed yaw around 15 seconds while the semantic version properly tracks the rotation.
  • ...and 1 more figure
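
A position offset like the one described for Figure 4 is what Absolute Trajectory Error (ATE) penalizes. The sketch below computes ATE as the root-mean-square of per-timestamp position errors. It assumes the estimated and ground-truth positions are already time-synchronized and expressed in the same map frame, so no trajectory alignment step is applied. The paper's exact evaluation protocol is not given in this summary, and the function name `ate_rmse` is hypothetical.

```python
import numpy as np


def ate_rmse(estimated_xyz, ground_truth_xyz):
    """Root-mean-square position error between two synchronized trajectories.

    Both inputs are (N, 3) arrays of XYZ positions in the same frame (assumed).
    """
    est = np.asarray(estimated_xyz, dtype=float)
    gt = np.asarray(ground_truth_xyz, dtype=float)
    if est.shape != gt.shape:
        raise ValueError("trajectories must have the same shape")
    # Euclidean distance between estimate and ground truth at each timestamp.
    errors = np.linalg.norm(est - gt, axis=1)
    return float(np.sqrt(np.mean(errors ** 2)))


# Usage: a stationary robot whose estimate jumps by 5 units at one timestamp.
ground_truth = np.zeros((4, 3))
estimate = np.zeros((4, 3))
estimate[2] = [3.0, 4.0, 0.0]
error = ate_rmse(estimate, ground_truth)
```

Because the errors are squared before averaging, a single large jump in the estimate contributes far more to the result than many small deviations.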