Finding Things in the Unknown: Semantic Object-Centric Exploration with an MAV

Sotiris Papatheodorou, Nils Funk, Dimos Tzoumanikas, Christopher Choi, Binbin Xu, Stefan Leutenegger

TL;DR

This work created a Micro Aerial Vehicle (MAV) semantic exploration simulator based on Habitat in order to quantitatively demonstrate how the framework can be used to efficiently find specific objects as part of exploration.

Abstract

Exploration of unknown space with an autonomous mobile robot is a well-studied problem. In this work we broaden the scope of exploration, moving beyond the purely geometric goal of uncovering as much free space as possible. We believe that for many practical applications, exploration should be contextualised with semantic and object-level understanding of the environment for task-specific exploration. Here, we study the task of both finding specific objects in unknown space and reconstructing them to a target level of detail. We therefore extend our environment reconstruction to consist not only of a background map, but also of object-level and semantically fused submaps. Importantly, we adapt our previous objective function of uncovering as much free space as possible in as little time as possible with two additional elements. First, we require a maximum observation distance of background surfaces to ensure target objects are not missed by image-based detectors because they appear too small to be detected. Second, we require an even smaller maximum distance to the found objects in order to reconstruct them with the desired accuracy. We further created a Micro Aerial Vehicle (MAV) semantic exploration simulator based on Habitat in order to quantitatively demonstrate how our framework can be used to efficiently find specific objects as part of exploration. Finally, we showcase that this capability can be deployed in real-world scenes involving our drone equipped with an Intel RealSense D455 RGB-D camera.
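
The two distance requirements in the abstract can be illustrated with a small sketch. This is not the paper's objective function: the thresholds, the per-surface record of the closest previous observation, and the names `surface_gain` and `candidate_utility` are all assumptions made for illustration. The sketch only shows one way a "gain per unit time" utility could reward views that bring background surfaces, and more strictly object surfaces, within a maximum observation distance.

```python
import math

# Hypothetical thresholds in metres; the paper's actual values are not given here.
D_BACKGROUND_MAX = 3.0
D_OBJECT_MAX = 1.5


def surface_gain(prev_min_dist, new_dist, is_object):
    """Gain of observing one surface element from a candidate view.

    prev_min_dist: closest distance the element was seen from so far
                   (math.inf if it was never observed).
    new_dist:      distance from the candidate view to the element.
    is_object:     True for target-object surfaces, False for background.
    """
    limit = D_OBJECT_MAX if is_object else D_BACKGROUND_MAX
    already_close_enough = prev_min_dist <= limit
    would_be_close_enough = new_dist <= limit
    # Only reward the view if it newly satisfies the distance requirement.
    return 1.0 if (not already_close_enough and would_be_close_enough) else 0.0


def candidate_utility(surface_hits, travel_time_s):
    """Assumed utility: total gain divided by the time needed to reach the view."""
    if travel_time_s <= 0:
        raise ValueError("travel_time_s must be positive")
    gain = sum(surface_gain(p, n, o) for p, n, o in surface_hits)
    return gain / travel_time_s


hits = [
    (math.inf, 2.0, False),  # unseen background, now within 3.0 m: gain
    (2.5, 1.0, False),       # background already seen within 3.0 m: no gain
    (2.5, 1.0, True),        # object seen only from 2.5 m, now 1.0 m: gain
    (math.inf, 2.0, True),   # object still farther than 1.5 m: no gain
]
utility = candidate_utility(hits, travel_time_s=4.0)
```

Under these assumptions, an object surface seen only from afar keeps attracting the planner until it has been observed from within the tighter object threshold, which mirrors the "even smaller maximum distance" described for found objects.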

Paper Structure

This paper contains 28 sections, 2 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: Objects reconstructed using our method [Top left] and their corresponding ground-truth meshes provided by the Matterport3D dataset (Chang et al., 3DV 2017) [Bottom left]. Top-down view of 3D reconstruction after exploration, and MAV path in yellow [Right].
  • Figure 2: Diagram of the proposed approach. The mapping module receives depth and colour image pairs together with the corresponding instance segmentation and pose, and updates the background map, the individual object maps and the set of frontiers. The planning module receives the background and object maps, the set of frontiers and the current pose, and produces the next goal path. Planning happens again when the MAV has completed the goal path.
  • Figure 3: [Left] Candidate view sampling near frontiers and objects, and path planning to candidates from the current pose. [Middle] Sparse $360^\circ$ raycasting from one of the candidate views and optimal yaw frustum in red. [Right] Entropy, background, object and combined gain images from the raycast shown in [Middle] with the optimal yaw field-of-view in red.
  • Figure 4: Objects reconstructed using the proposed method [Top] and their corresponding ground-truth meshes [Bottom]. Notice some artefacts due to erroneous segmentation masks.
  • Figure 5: Median, $10^{th}$ and $90^{th}$ percentiles of explored volume [Left] and percentage of objects found [Right].
  • ...and 4 more figures
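
The optimal-yaw selection described in the Figure 3 caption can be sketched as a circular sliding window over a $360^\circ$ gain image. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `best_yaw`, the convention that column `c` covers yaw angles starting at `c * 360 / cols` degrees, and the plain column-sum scoring are all hypothetical.

```python
def best_yaw(gain_image, hfov_deg):
    """Pick the yaw whose horizontal field of view covers the most gain.

    gain_image: list of rows, each a list of per-pixel gains, where the
                columns span 360 degrees of yaw (assumed layout).
    hfov_deg:   horizontal field of view of the camera in degrees.
    Returns (yaw_deg, total_gain), with yaw_deg the centre of the best window.
    """
    cols = len(gain_image[0])
    # Collapse the image to one gain value per yaw column.
    col_gain = [sum(row[c] for row in gain_image) for c in range(cols)]

    # Number of columns covered by the camera's field of view.
    window = min(cols, max(1, round(cols * hfov_deg / 360.0)))

    # Circular sliding-window sum so views can wrap past 360 degrees.
    current = sum(col_gain[:window])
    best_sum, best_start = current, 0
    for start in range(1, cols):
        current += col_gain[(start + window - 1) % cols] - col_gain[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start

    deg_per_col = 360.0 / cols
    yaw_deg = (best_start + window / 2) * deg_per_col % 360.0
    return yaw_deg, best_sum


# One-row example image with 8 columns (45 degrees each) and a 90 degree FOV.
yaw, gain = best_yaw([[0, 0, 5, 5, 0, 0, 0, 1]], hfov_deg=90)

# The window wraps around the image boundary when the gain sits at both ends.
wrap_yaw, wrap_gain = best_yaw([[5, 0, 0, 0, 0, 0, 0, 5]], hfov_deg=90)
```

Because the raycast is sparse and the window sum is updated incrementally, the search over all yaw angles costs a single pass over the columns. The combined gain image in the caption would be the input here; how entropy, background and object gains are combined is not specified in this section and is left out of the sketch.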