DHP-Mapping: A Dense Panoptic Mapping System with Hierarchical World Representation and Label Optimization Techniques

Tianshuai Hu, Jianhao Jiao, Yucheng Xu, Hongji Liu, Sheng Wang, Ming Liu

TL;DR

DHP-Mapping tackles dense 3D semantic mapping by building a hierarchical scene representation composed of multiple TSDF submaps, each carrying both geometry and panoptic labels. It introduces inter-submap label fusion to eliminate overlaps and a multi-variable CRF with mean-field inference to enforce label consistency using color and geometric cues, formalized via a Gibbs energy $E(X_S,X_I|D)$ and Gaussian kernels $K(y,z)$. Experimental results on indoor RGB-D and outdoor SemanticKITTI datasets demonstrate competitive geometry and label accuracy against state-of-the-art baselines, while the hierarchical submap structure provides improved scalability and data retrieval. The work is open-source, enabling reproducibility and facilitating higher-level interactive tasks in dynamic environments.
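
To make the mean-field step concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a single label variable per voxel (the paper's CRF couples the two variables $X_S$ and $X_I$), a Potts-style compatibility, and one Gaussian kernel of the common form $K(y,z)=\exp(-\lVert y-z\rVert^2 / 2\theta^2)$ over a feature vector per voxel. The names `gaussian_kernel`, `mean_field`, `bandwidth`, and `weight` are hypothetical.

```python
import numpy as np

def gaussian_kernel(features, bandwidth):
    """Pairwise K(y, z) = exp(-||y - z||^2 / (2 * bandwidth^2)), zero diagonal."""
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    np.fill_diagonal(kernel, 0.0)  # a voxel sends no message to itself
    return kernel

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def mean_field(unary, kernel, weight=1.0, iterations=5):
    """Approximate the label marginals Q of a dense CRF.

    unary:  (N, L) negative log-likelihood of each label for each voxel.
    kernel: (N, N) pairwise similarity between voxels.
    Assumption: Potts compatibility, so agreeing with similar voxels
    lowers the energy by `weight * kernel`.
    """
    q = softmax(-unary)
    for _ in range(iterations):
        message = kernel @ q  # support for each label from similar voxels
        q = softmax(-unary + weight * message)
    return q

# Toy data: three voxels with similar features and one far away.
# The feature could stack color and geometric cues; one scalar is used here.
features = np.array([[0.0], [0.1], [0.2], [5.0]])
probs = np.array([[0.9, 0.1], [0.4, 0.6], [0.9, 0.1], [0.1, 0.9]])
unary = -np.log(probs)

q = mean_field(unary, gaussian_kernel(features, bandwidth=0.5))
labels = q.argmax(axis=1)
```

In this toy case voxel 1 initially favors label 1, but its two similar neighbors strongly favor label 0, so the update pulls it to label 0. The distant voxel 3 receives almost no message and keeps its own label.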

Abstract

Maps provide robots with crucial environmental knowledge, thereby enabling them to perform interactive tasks effectively. Easily accessing accurate abstract-to-detailed geometric and semantic concepts from maps is crucial for robots to make informed and efficient decisions. To comprehensively model the environment and effectively manage the map data structure, we propose DHP-Mapping, a dense mapping system that utilizes multiple Truncated Signed Distance Field (TSDF) submaps and panoptic labels to hierarchically model the environment. The output map maintains both voxel- and submap-level metric and semantic information. Two modules are presented to enhance mapping efficiency and label consistency: (1) an inter-submap label fusion strategy to eliminate duplicate points across submaps and (2) a conditional random field (CRF) based approach to enhance panoptic labels through object label comprehension and contextual information. We conducted experiments on two public datasets covering indoor and outdoor scenarios. Our system performs comparably to state-of-the-art (SOTA) methods across geometry and label accuracy evaluation metrics. The experimental results highlight the effectiveness and scalability of our system, as it is capable of constructing precise geometry and maintaining consistent panoptic labels. Our code is publicly available at https://github.com/hutslib/DHP-Mapping.
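
The two-level lookup implied by this hierarchy (submap first, then voxel) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the released code: the class `Submap`, its fields, the axis-aligned bounding volume, and the function `query` are all hypothetical, and a Python `dict` stands in for the voxel hash table.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Submap:
    submap_id: int
    semantic_class: str
    voxel_size: float
    bbox_min: tuple  # assumed axis-aligned bounding volume
    bbox_max: tuple
    tsdf_layer: dict = field(default_factory=dict)   # voxel index -> (distance, weight)
    label_layer: dict = field(default_factory=dict)  # voxel index -> label

    def contains(self, point):
        return all(lo <= c <= hi
                   for lo, c, hi in zip(self.bbox_min, point, self.bbox_max))

    def voxel_index(self, point):
        # Integer grid index; the dict hashes this tuple for the lookup.
        return tuple(math.floor(c / self.voxel_size) for c in point)

def query(submaps, point):
    """Return (submap_id, tsdf_value, label) for every submap that stores the point."""
    hits = []
    for submap in submaps:
        if not submap.contains(point):  # level 1: bounding-volume test
            continue
        index = submap.voxel_index(point)  # level 2: hashed voxel lookup
        if index in submap.tsdf_layer:
            hits.append((submap.submap_id,
                         submap.tsdf_layer[index],
                         submap.label_layer.get(index)))
    return hits

chair = Submap(submap_id=7, semantic_class="chair", voxel_size=0.1,
               bbox_min=(0.0, 0.0, 0.0), bbox_max=(1.0, 1.0, 1.0))
chair.tsdf_layer[(2, 3, 4)] = (0.01, 1.0)
chair.label_layer[(2, 3, 4)] = "chair"

inside = query([chair], (0.25, 0.35, 0.45))
outside = query([chair], (2.0, 2.0, 2.0))
```

The bounding-volume test lets a query skip most submaps without touching their voxels, and the TSDF and label layers are read separately once a candidate submap is found.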

Paper Structure (14 sections, 5 equations, 6 figures, 4 tables)


Figures (6)

  • Figure 1: DHP-Mapping portrays the scenario through a collection of TSDF submaps. Each submap represents either a foreground thing or background stuff, together with its semantic class and instance ID.
  • Figure 2: Our system converts sensor data, poses, and panoptic segmentation into point segments. (A): A data association process utilizes semantic categories and 3D IoU metrics to assign a submap ID to each segment. (B): Segment information is integrated into the assigned submap's TSDF and label layers through ray-tracing. (C): To prevent submap overlap, voxels occupying the same space are identified and their information is fused, ensuring each submap exclusively represents an object. (D): A CRF algorithm enhances the precision of semantic and instance labels by encouraging label consistency among voxels exhibiting similar color and geometric features.
  • Figure 3: To query a point in the world space, the first step is to identify its corresponding submap via submap bounding volumes. Then a hash function can be employed to query the TSDF and label voxels within each layer separately.
  • Figure 4: The left part of this figure shows bundled ray-tracing of the points in a segment. The segment is classified with the semantic class car and is associated with submap A. The figure also illustrates that a spatial position can be covered by more than one submap, as indicated by the voxels marked in green. The orange tables detail how the bundled ray updates a voxel in submap A: the semantic count of car and the instance ID count of A are both increased by one. The blue tables present the semantic and instance records for a voxel within submap B that occupies the same spatial position as the voxel in submap A. The green tables display the label information for this spatial position, obtained by merging the data from submaps A and B. After the process described in the map management section, the merged label is stored solely in a voxel of submap A.
  • Figure 5: Visualization results of dense panoptic mapping systems run on the flat and SemanticKITTI datasets. Meshes are extracted from the TSDF map using the marching cubes algorithm. The first row displays our system's map reconstruction results using the color values stored in the TSDF layers. The second to fifth rows show the maps with label results. Different colors in each sub-figure represent unique object IDs. Compared with Panmap, our DHP-Mapping produces more consistent labels and reconstructs denser and more accurate geometry (most evident in columns d-f). The proposed refinement techniques help to reduce submap overlaps and enhance label accuracy; without them, submaps tend to intertwine with each other (highlighted by black circles).
  • ...and 1 more figure
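
Two steps from the captions above lend themselves to a short sketch: the 3D IoU used in data association (Figure 2, A) and the label counting and merging of overlapping voxels (Figure 4). The Python below is a hedged illustration, not the authors' code. It assumes axis-aligned boxes for the IoU, assumes that merging means summing the per-label counts, and leaves the choice of which submap keeps the merged voxel to the caller. The names `aabb_iou_3d`, `LabelVoxel`, and `fuse_overlapping` are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass, field

def aabb_iou_3d(box_a, box_b):
    """IoU of two axis-aligned boxes, each given as (min_xyz, max_xyz)."""
    (a_min, a_max), (b_min, b_max) = box_a, box_b
    intersection = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a_min, a_max, b_min, b_max):
        overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
        if overlap <= 0.0:
            return 0.0  # boxes are disjoint along this axis
        intersection *= overlap
    volume_a = math.prod(hi - lo for lo, hi in zip(a_min, a_max))
    volume_b = math.prod(hi - lo for lo, hi in zip(b_min, b_max))
    return intersection / (volume_a + volume_b - intersection)

@dataclass
class LabelVoxel:
    semantic_counts: Counter = field(default_factory=Counter)
    instance_counts: Counter = field(default_factory=Counter)

    def integrate(self, semantic_class, instance_id):
        # One bundled ray hit: both counts increase by one.
        self.semantic_counts[semantic_class] += 1
        self.instance_counts[instance_id] += 1

    def label(self):
        # The most frequently observed class and instance win.
        return (self.semantic_counts.most_common(1)[0][0],
                self.instance_counts.most_common(1)[0][0])

def fuse_overlapping(keeper, other):
    """Merge the records of two voxels at the same position into `keeper`.

    Assumption: counts are summed, and `other` is emptied so that only
    one submap stores the position afterwards.
    """
    keeper.semantic_counts.update(other.semantic_counts)
    keeper.instance_counts.update(other.instance_counts)
    other.semantic_counts.clear()
    other.instance_counts.clear()

iou = aabb_iou_3d(((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
                  ((0.5, 0.0, 0.0), (1.5, 1.0, 1.0)))

voxel_a, voxel_b = LabelVoxel(), LabelVoxel()
for _ in range(3):
    voxel_a.integrate("car", "A")
voxel_b.integrate("car", "B")
voxel_b.integrate("road", "B")
fuse_overlapping(voxel_a, voxel_b)
merged_label = voxel_a.label()
```

Keeping raw counts rather than a single label per voxel means the merge stays a simple addition, and the final label can be re-derived at any time by a majority vote.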