SeGuE: Semantic Guided Exploration for Mobile Robots

Cody Simons, Aritra Samanta, Amit K. Roy-Chowdhury, Konstantinos Karydis

TL;DR

SeGuE addresses semantic exploration by integrating semantic feature maps into a next-best-view framework for mobile robots. It scores each candidate viewpoint by the average entropy of the semantic features visible from it, tracks convergence so that it does not over-exploit areas that remain uncertain, and offers two strategies for sampling candidate views: Uniform and Importance Sampling. The approach is validated in both simulation and real-world experiments, where it achieves higher semantic map coverage and lower semantic uncertainty than baselines. The resulting automatic, high-quality semantic maps can support downstream embodied AI tasks and deployment across diverse environments.
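The scoring idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`entropy`, `score_view`, `sample_views`), the dictionary-based map of class probabilities, and the representation of convergence as a set of excluded cells are all assumptions made for illustration. How Importance Sampling weights candidates is not stated here, so the `weights` argument is likewise hypothetical.

```python
import math
import random

def entropy(probs):
    """Shannon entropy (in nats) of one class distribution."""
    return sum(-p * math.log(p) for p in probs if p > 0.0)

def score_view(visible_cells, class_probs, converged=frozenset()):
    """Average entropy over visible cells that have not yet converged.

    visible_cells: iterable of (row, col) cells seen from the candidate pose.
    class_probs:   dict mapping (row, col) -> list of class probabilities
                   (hypothetical stand-in for the semantic feature map).
    converged:     cells treated as settled and left out of the score
                   (an assumed way to model convergence tracking).
    """
    cells = [c for c in visible_cells if c in class_probs and c not in converged]
    if not cells:
        return 0.0
    return sum(entropy(class_probs[c]) for c in cells) / len(cells)

def sample_views(free_cells, n, rng, weights=None):
    """Draw n candidate cells uniformly, or in proportion to `weights`."""
    return rng.choices(free_cells, weights=weights, k=n)

# Usage: score two candidate views and keep the more uncertain one.
probs = {(0, 0): [0.5, 0.5], (0, 1): [1.0, 0.0], (1, 0): [0.25] * 4}
views = {"a": [(0, 0), (0, 1)], "b": [(1, 0)]}
best = max(views, key=lambda name: score_view(views[name], probs))
```

Averaging, rather than summing, keeps a view from scoring highly merely because it sees many cells; excluding converged cells keeps persistently ambiguous regions from attracting the robot indefinitely.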

Abstract

The rise of embodied AI applications has enabled robots to perform complex tasks that require a sophisticated understanding of their environment. To enable successful robot operation in such settings, maps must include semantic information in addition to geometric information. In this paper, we address the novel problem of semantic exploration, whereby a mobile robot must autonomously explore an environment to fully map both its structure and the semantic appearance of features. We develop a method based on next-best-view exploration, where potential poses are scored based on the semantic features visible from that pose. We explore two alternative methods for sampling potential views and demonstrate the effectiveness of our framework in both simulation and physical experiments. Automatic creation of high-quality semantic maps can help robots better understand and interact with their environments and make future embodied AI applications easier to deploy.

Paper Structure

This paper contains 19 sections, 1 equation, 4 figures, 3 tables, 1 algorithm.

Figures (4)

  • Figure 1: An example of the view mask generated by our raytracing algorithm. Known obstacles are shown in black and the view mask is shown in gray.
  • Figure 2: Visualization of the mapping results of (a-b) the frontier-based baseline and (c-d) SeGuE in the Small House environment. We show both the occupancy grid and semantic prediction, computed from the map of semantic features. On the occupancy grid, we plot the trajectory of the robot.
  • Figure 3: Visualization of the mapping results of (a-b) the frontier-based baseline and (c-d) SeGuE in the Bookstore environment. We show both the occupancy grid and semantic prediction, computed from the map of semantic features. On the occupancy grid, we plot the trajectory of the robot.
  • Figure 4: Mapping results of SeGuE in real-world experiments.
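To make the view mask of Figure 1 concrete, the following Python sketch computes the set of grid cells visible from a pose by marching rays across an occupancy grid until they hit a known obstacle. This is a generic fixed-step ray-marching approximation, not the paper's raytracing algorithm; the function name `view_mask` and the parameters `max_range`, `n_rays`, and `step` are assumptions chosen for illustration.

```python
import math

def view_mask(grid, origin, max_range, n_rays=360, step=0.5):
    """Return the set of (row, col) cells visible from `origin`.

    grid:      list of rows; 1 marks a known obstacle, 0 anything else.
    origin:    (row, col) cell of the sensor.
    max_range: maximum sensing distance, in cells.
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = origin
    visible = {origin}
    for k in range(n_rays):
        theta = 2.0 * math.pi * k / n_rays
        dr, dc = math.sin(theta), math.cos(theta)
        dist = step
        while dist <= max_range:
            r = int(round(r0 + dr * dist))
            c = int(round(c0 + dc * dist))
            if not (0 <= r < rows and 0 <= c < cols):
                break  # ray left the map
            if grid[r][c] == 1:
                break  # ray is blocked by a known obstacle
            visible.add((r, c))
            dist += step
    return visible

# Usage: a 5x5 map with a wall along column 2, sensor at the left edge.
grid = [[1 if c == 2 else 0 for c in range(5)] for _ in range(5)]
mask = view_mask(grid, origin=(2, 0), max_range=10)
```

Fixed-step marching is simple but can slip through diagonal gaps between obstacle cells; an exact grid traversal would avoid that at the cost of more bookkeeping.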