Online Object-Oriented Semantic Mapping and Map Updating

Nils Dengler, Tobias Zaenker, Francesco Verdoja, Maren Bennewitz

TL;DR

The paper addresses robust online semantic mapping for indoor service robots by introducing a modular, object-centered map that stores, for each object, a label, a 3D point cloud, a 2D polygon, and an oriented bounding box, together with an existence likelihood to cope with dynamic changes and false detections. The approach combines RGB-D detections with point-cloud-based geometric segmentation, robust data association using an R-tree, and an object refinement mechanism that undoes incorrect merges, and it provides multiple representations per object. A per-object likelihood L_i governs object persistence and deletion, enabling the map to adapt to object motion and occlusions while maintaining a bounded history via a deletion threshold τ. Empirical evaluation on two robots across four real-world scenes shows competitive IoU and distance metrics compared to the Hypermap approach of Zaenker et al., with online performance of around 10–12 Hz and a detailed runtime breakdown that highlights the detector’s contribution to overall latency.

Abstract

Creating and maintaining an accurate representation of the environment is an essential capability for every service robot. Semantic information is especially important for household robots acting in indoor environments. In this paper, we present a semantic mapping framework with modular map representations. Our system is capable of online mapping and object updating given object detections from RGB-D data and provides various 2D and 3D representations of the mapped objects. To undo wrong data associations, we perform a refinement step when updating object shapes. Furthermore, we maintain an existence likelihood for each object to deal with false positive and false negative detections and keep the map updated. Our mapping system is highly efficient and runs at more than 10 Hz. We evaluated our approach in various environments using two different robots, i.e., a Toyota HSR and a Fraunhofer Care-O-Bot-4. As the experimental results demonstrate, our system is able to generate maps that are close to the ground truth and outperforms an existing approach in terms of intersection over union, different distance metrics, and the number of correct object mappings.
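
The existence-likelihood idea can be illustrated with a minimal sketch. This is not the update rule from the paper, which is not reproduced in this summary. It assumes a simple additive scheme in which the likelihood L_i rises when an object is re-detected, falls when the object should be visible but is not detected, and the object is deleted once L_i drops below the threshold τ. The names `MappedObject` and `update_existence` and the values of `gain`, `decay`, and `tau` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MappedObject:
    label: str
    likelihood: float = 0.5  # hypothetical initial existence likelihood L_i


def update_existence(objects, detected_ids, in_view_ids,
                     gain=0.2, decay=0.1, tau=0.15):
    """Return the objects that survive one update step.

    objects      -- dict mapping object id to MappedObject
    detected_ids -- ids matched to a detection in the current frame
    in_view_ids  -- ids that lie in the current field of view
    gain, decay, tau -- illustrative values, not taken from the paper
    """
    survivors = {}
    for oid, obj in objects.items():
        if oid in detected_ids:
            # Re-detected: raise the likelihood, capped at 1.
            obj.likelihood = min(1.0, obj.likelihood + gain)
        elif oid in in_view_ids:
            # Expected but missing: lower the likelihood, floored at 0.
            obj.likelihood = max(0.0, obj.likelihood - decay)
        # Objects outside the field of view keep their likelihood.
        if obj.likelihood >= tau:
            survivors[oid] = obj
    return survivors


if __name__ == "__main__":
    scene = {1: MappedObject("cup"), 2: MappedObject("chair")}
    for _ in range(4):
        # The cup is seen every frame; the chair is in view but never detected.
        scene = update_existence(scene, detected_ids={1}, in_view_ids={1, 2})
    print(sorted(obj.label for obj in scene.values()))
```

In this sketch a single missed detection does not remove an object, so brief occlusions or false negatives are tolerated, while a spurious object that is never confirmed again eventually falls below the threshold and is dropped.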

Paper Structure

This paper contains 15 sections, 1 equation, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Example use case of our proposed system. A user commands the robot to get a specific cup (upper left image). To get the position of the cup, the robot checks the created map for cups near a coffee machine. The lower image shows a 2D polygonal representation of the environment mapped with our framework. With the 2D information, the robot can navigate to the goal destination. To grab the cup, the robot then utilizes the 3D representation of the objects (upper right image), which is also maintained by our system.
  • Figure 2: Overview of our semantic mapping system. The square boxes indicate the external inputs.
  • Figure 3: Office workplace observed by the robot. (a) visualizes the whole point cloud with the three filtered surfaces colored in red (wall), blue (table), and green (floor). All parts of the point cloud that remain after preprocessing are colored white. Figures (b) to (d) show the mapped objects in three different representations (point cloud, polygon, and oriented bounding box (OBB)).
  • Figure 4: Example of mapping part of the environment (a) with and (b) without object shape refinement during updates. Shown are the object point clouds, the 2D polygons, and the labels. In (a) the two chairs are clearly separated, while in (b) they are merged together due to many outliers resulting from, e.g., wrong segmentation or localization inaccuracy.
  • Figure 5: (a) Visualization of the handcrafted ground truth map, (b) result of the presented approach, and (c) result of the approach by Zaenker et al. Each object is represented by a polygon, colored according to the object class shown in (d). While our approach achieves a result close to the ground truth, the Hypermap result is only a rough approximation.
  • ...and 1 more figure