
PanopticNDT: Efficient and Robust Panoptic Mapping

Daniel Seichter, Benedict Stephan, Söhnke Benedikt Fischedick, Steffen Müller, Leonard Rabes, Horst-Michael Gross

TL;DR

PanopticNDT introduces an efficient, robust panoptic mapping framework built on occupancy NDT, integrating dense panoptic segmentation (via EMSANet) with a 3D panoptic NDT map. By decoupling semantic and instance information and performing 2D instance matching with 3D projection, it achieves real-time panoptic mapping on mobile robots, outperforming prior panoptic fusion methods on Hypersim and ScanNetV2. The approach demonstrates strong 3D/2D panoptic quality, resilience to segmentation noise, and practical applicability in real-world domestic robotics, with open-source data preparation and evaluation pipelines. This work advances scene understanding for autonomous indoor navigation by delivering high-detail panoptic maps suitable for real-time decision making.

Abstract

As the application scenarios of mobile robots are getting more complex and challenging, scene understanding becomes increasingly crucial. A mobile robot that is supposed to operate autonomously in indoor environments must have precise knowledge about what objects are present, where they are, what their spatial extent is, and how they can be reached; i.e., information about free space is also crucial. Panoptic mapping is a powerful instrument providing such information. However, building 3D panoptic maps with high spatial resolution is challenging on mobile robots, given their limited computing capabilities. In this paper, we propose PanopticNDT - an efficient and robust panoptic mapping approach based on occupancy normal distribution transform (NDT) mapping. We evaluate our approach on the publicly available datasets Hypersim and ScanNetV2. The results reveal that our approach can represent panoptic information at a higher level of detail than other state-of-the-art approaches while enabling real-time panoptic mapping on mobile robots. Finally, we prove the real-world applicability of PanopticNDT with qualitative results in a domestic application.
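The abstract names occupancy NDT mapping without describing the underlying data structure. As a rough illustration of the general NDT idea only (each voxel summarizes the points that fall into it as a Gaussian), the sketch below keeps a running mean and covariance per voxel and attaches a label vote count. The names `NDTCell`, `voxel_key`, and `build_map` are hypothetical, and the majority vote over labels is a simplifying assumption, not the fusion rule used by PanopticNDT. Occupancy and free-space updates are omitted.

```python
import math
from collections import Counter, defaultdict


class NDTCell:
    """One voxel: running 3D Gaussian over its points plus label votes.

    Illustrative only; not the authors' implementation.
    """

    def __init__(self):
        self.n = 0
        self.mean = [0.0, 0.0, 0.0]
        # Accumulated outer products of deviations (Welford-style update).
        self.m2 = [[0.0] * 3 for _ in range(3)]
        # Assumption: labels are fused by simple vote counting.
        self.label_votes = Counter()

    def add_point(self, p, label):
        self.n += 1
        delta_old = [p[i] - self.mean[i] for i in range(3)]
        for i in range(3):
            self.mean[i] += delta_old[i] / self.n
        delta_new = [p[i] - self.mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                self.m2[i][j] += delta_old[i] * delta_new[j]
        self.label_votes[label] += 1

    def covariance(self):
        """Sample covariance, or None with fewer than two points."""
        if self.n < 2:
            return None
        return [[v / (self.n - 1) for v in row] for row in self.m2]

    def best_label(self):
        if not self.label_votes:
            return None
        return self.label_votes.most_common(1)[0][0]


def voxel_key(p, voxel_size):
    """Integer grid index of the voxel containing point p."""
    return tuple(math.floor(c / voxel_size) for c in p)


def build_map(points, labels, voxel_size=0.1):
    """Group labeled 3D points into NDT cells on a regular grid."""
    cells = defaultdict(NDTCell)
    for p, label in zip(points, labels):
        cells[voxel_key(p, voxel_size)].add_point(p, label)
    return dict(cells)


if __name__ == "__main__":
    pts = [(0.01, 0.02, 0.03), (0.03, 0.04, 0.05), (0.25, 0.0, 0.0)]
    lbl = ["chair", "chair", "floor"]
    for key, cell in build_map(pts, lbl, voxel_size=0.1).items():
        print(key, cell.n, cell.best_label())
```

Storing only a mean, a covariance, and a small label summary per voxel is what lets an NDT-style map describe structure below the voxel size without keeping every point, which is relevant to the abstract's point about limited computing capabilities on mobile robots.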

Paper Structure (25 sections, 12 equations, 6 figures, 6 tables)


Figures (6)

  • Figure 1: Panoptic occupancy NDT (P-NDT) map built with the predicted panoptic segmentation of EMSANet [emsanet2022ijcnn] and a voxel size of 10cm for scene ai_051_001 of the Hypersim [hypersim-iccv2021] test split. Bottom: corresponding instance map. Best viewed in color at 300%. Black indicates no_instance; see Fig. \ref{fig:experiments:hypersim_radar_chart} for semantic colors. Panoptic labels are visualized by small color differences.
  • Figure 2: Overview of our two-step approach for panoptic mapping.
  • Figure 3: Per-class 2D panoptic quality on the Hypersim test split for EMSANet and when mapping with its predictions and voxel sizes of 10cm and 5cm. Classes printed in gray do not appear in the test split.
  • Figure 4: Qualitative results for scene ai_001_10 of the Hypersim test split (top) and for scene_0757_00 and scene_0761_00 of the hidden ScanNetV2 test split (bottom) when mapping with EMSANet predictions and a voxel size of 10cm. Left to right: color image and panoptic back-projection for the given camera pose (see 3D scenes), panoptic, panoptic semantic, and panoptic instance NDT map. Best viewed in color at 200%. Black indicates void/no_instance; for the semantic colors, we refer to Fig. \ref{fig:experiments:hypersim_radar_chart}. Panoptic labels are visualized by small color differences based on the semantic color.
  • Figure 5: Qualitative results for scene ai_001_10 of the Hypersim test split. The upper part shows the dataset's ground truth as well as the thresholded predictions of EMSANet [emsanet2022ijcnn] (see Sec. \ref{sec:experiments:implementation}). The lower part compares our proposed PanopticNDT with voxel sizes of 5cm and 10cm to Panoptic Multi-TSDFs [panoptic-multi-tsdf-2022-icra]. For each mapping approach, results are shown both when mapping with ground truth (top) and when mapping with the predicted segmentation of EMSANet (bottom). Best viewed in color at 300%. Black indicates void/no_instance; for the semantic colors, we refer to Fig. \ref{fig:experiments:hypersim_radar_chart}. Panoptic labels are visualized by small color differences based on the semantic color.
  • ...and 1 more figure