Open-Set Semantic Uncertainty Aware Metric-Semantic Graph Matching

Kurran Singh, John J. Leonard

TL;DR

This work tackles open-set, object-based SLAM in challenging underwater environments by introducing a semantic-uncertainty-aware graph matching framework. It represents detected objects as 384-dimensional semantic embeddings with per-object uncertainty and casts open-set place recognition as a Quadratic Assignment Problem with node and edge affinities, including WeightedCosineSim, Mahalanobis, and Bhattacharyya-based measures. The method combines uncertainty quantification, graph matching solvers (spectral, RRWM, A*, neural), and a local-map construction pipeline with Kalman-filter-based uncertainty tracking. It is demonstrated on underwater data and on KITTI to show real-time feasibility and cross-domain generalization. Key findings are that the weighted cosine affinity often yields the best trade-off, A* achieves the highest accuracy on small graphs, RRWM scales well to larger graphs, and the approach robustly handles unseen object classes for loop closure and map merging. The work thus enables robust, open-set, multi-object, semantic-uncertainty-aware loop closure in marine environments and beyond.
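
For orientation, a Quadratic Assignment Problem for graph matching is commonly written in Lawler's form, shown below. This is the generic textbook statement, not necessarily the paper's exact notation: the symbols $X$, $K$, $n_1$, and $n_2$ are assumptions introduced here. $X$ is a binary assignment matrix between the $n_1$ nodes of one graph and the $n_2$ nodes of the other, and $K$ is an $n_1 n_2 \times n_1 n_2$ affinity matrix whose diagonal holds node affinities and whose off-diagonal entries hold edge affinities.

$$
\max_{X \in \{0,1\}^{n_1 \times n_2}} \; \operatorname{vec}(X)^{\top} K \, \operatorname{vec}(X)
\quad \text{s.t.} \quad
X \mathbf{1}_{n_2} \le \mathbf{1}_{n_1}, \qquad
X^{\top} \mathbf{1}_{n_1} \le \mathbf{1}_{n_2}
$$

The inequality constraints let each node match at most one node in the other graph, which allows a small local map to be matched against a larger one.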

Abstract

Underwater object-level mapping requires incorporating visual foundation models to handle the uncommon and often previously unseen object classes encountered in marine scenarios. In this work, a metric of semantic uncertainty for open-set object detections produced by visual foundation models is calculated and then incorporated into an object-level uncertainty tracking framework. Object-level uncertainties and geometric relationships between objects are used to enable robust object-level loop closure detection for unknown object classes. This loop closure detection problem is formulated as a graph-matching problem. While graph matching, in general, is NP-complete, a solver for an equivalent formulation of the proposed graph matching problem as a graph editing problem is tested on multiple challenging underwater scenes. Results for this solver as well as three other solvers demonstrate that the proposed methods are feasible for real-time use in marine environments for robust, open-set, multi-object, semantic-uncertainty-aware loop closure detection. Further experimental results on the KITTI dataset demonstrate that the method generalizes to large-scale terrestrial scenes.

Paper Structure (12 sections, 15 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: An overview of the proposed approach. In the initial local map building phase (top row), a vehicle navigates through an underwater scene and constructs a local map of the environment, using latent vectors to represent objects that are detected by an open-set detector and localized using sonar information. Later (bottom row), the vehicle sees part of the same scene as in the initial pass and creates a local map that overlaps with the initial scene. Next, in the problem formulation phase, the two local maps are converted into graphs in which the nodes are uncertainty-weighted latent vectors for each object and the edges are the relative distances between the objects. From these graphs, an affinity matrix is created using the node and edge affinity functions defined in Section \ref{sec:formulation}. A graph matching solver then produces a soft matching matrix, from which a hard matching matrix is obtained via the Hungarian algorithm. Finally, the hard match is used to visualize the object-level correspondences.
  • Figure 2: Images with low ($< 0.15$) uncertainty.
  • Figure 3: Images with high ($> 0.35$) uncertainty.
  • Figure 4: Matching results from run 1. The method finds the correct correspondences for the 4-node submap in the original full map despite noisy object locations from sonar-based ranging and uncertain object detections caused by underwater lighting effects.
  • Figure 5: Results on the KITTI dataset demonstrate that the technique generalizes to terrestrial settings and remains feasible for larger-scale graph matching. The different colors represent different object classes. Subgraphs were extracted at the locations shown, where loop closure opportunities arise in the trajectory. The correspondences are drawn for illustrative purposes.
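
The pipeline in the Figure 1 caption (graphs, affinity matrix, soft matching, Hungarian hard matching) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, plain cosine similarity stands in for the paper's uncertainty-weighted node affinity, the Gaussian kernel on distance differences is an assumed edge affinity, and the soft matching uses a basic spectral (power-iteration) solver.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def build_affinity(emb1, pos1, emb2, pos2, sigma=1.0):
    """Affinity matrix K of shape (n1*n2, n1*n2) for two object graphs.

    Candidate match (i, a) pairs node i of graph 1 with node a of graph 2
    and lives at flat index i * n2 + a.
    """
    n1, n2 = len(emb1), len(emb2)
    # Node affinity (assumption): cosine similarity, clipped to be non-negative.
    u1 = emb1 / np.linalg.norm(emb1, axis=1, keepdims=True)
    u2 = emb2 / np.linalg.norm(emb2, axis=1, keepdims=True)
    node_aff = np.clip(u1 @ u2.T, 0.0, None)
    # Edge attributes: pairwise Euclidean distances inside each graph.
    d1 = np.linalg.norm(pos1[:, None] - pos1[None, :], axis=-1)
    d2 = np.linalg.norm(pos2[:, None] - pos2[None, :], axis=-1)
    # Edge affinity (assumption): Gaussian kernel on d1[i, j] - d2[a, b].
    diff = d1[:, None, :, None] - d2[None, :, None, :]
    edge_aff = np.exp(-(diff ** 2) / (2.0 * sigma ** 2))
    # Zero out pairs of candidates that reuse a node (i == j or a == b).
    same_i = np.eye(n1, dtype=bool)[:, None, :, None]
    same_a = np.eye(n2, dtype=bool)[None, :, None, :]
    edge_aff = np.where(same_i | same_a, 0.0, edge_aff)
    K = edge_aff.reshape(n1 * n2, n1 * n2)
    # Node affinities go on the diagonal.
    K[np.diag_indices(n1 * n2)] = node_aff.ravel()
    return K


def spectral_soft_match(K, n1, n2, iters=100):
    """Soft matching matrix from the principal eigenvector of K."""
    v = np.ones(K.shape[0]) / np.sqrt(K.shape[0])
    for _ in range(iters):
        v = K @ v
        v /= np.linalg.norm(v)
    return v.reshape(n1, n2)


def hard_match(soft):
    """Hungarian algorithm: one-to-one matches maximizing total soft score."""
    rows, cols = linear_sum_assignment(soft, maximize=True)
    return list(zip(rows.tolist(), cols.tolist()))


def demo():
    rng = np.random.default_rng(0)
    emb1 = rng.normal(size=(5, 384))  # 384-dim embeddings, as in the TL;DR
    pos1 = np.array([[0., 0.], [3., 1.], [1., 8.], [10., 6.], [14., -2.]])
    keep = [3, 0, 4, 1]  # second map re-observes four objects, reordered
    emb2 = emb1[keep] + 0.05 * rng.normal(size=(4, 384))
    pos2 = pos1[keep] + 0.05 * rng.normal(size=(4, 2))
    K = build_affinity(emb1, pos1, emb2, pos2)
    return hard_match(spectral_soft_match(K, 5, 4))


if __name__ == "__main__":
    print(demo())  # list of (graph-1 index, graph-2 index) pairs
```

Because the edge affinity depends only on relative distances, the sketch does not need the two local maps to share a coordinate frame. The dense affinity matrix grows as $(n_1 n_2)^2$, so this form is only practical for small local maps.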