HeLiMOS: A Dataset for Moving Object Segmentation in 3D Point Clouds From Heterogeneous LiDAR Sensors

Hyungtae Lim; Seoyeon Jang; Benedikt Mersch; Jens Behley; Hyun Myung; Cyrill Stachniss

TL;DR

This work addresses moving object segmentation (MOS) in 3D point clouds captured by heterogeneous LiDAR sensors by introducing HeLiMOS, a dataset with MOS labels for four sensor types (both solid-state and omnidirectional). It couples an instance-aware automatic labeling pipeline, which combines topology-based pose correction, ERASOR2-based instance-aware annotation, and tracking-based filtering, with human refinement to efficiently produce high-quality MOS labels. Extensive evaluations using $IoU_{MOS}$ demonstrate that training on HeLiMOS improves cross-sensor generalization and reveal the need for sensor-agnostic MOS methods, while the labeling framework significantly reduces manual annotation effort. Overall, HeLiMOS provides a benchmark and methodology for robust MOS and static map building across heterogeneous LiDARs, advancing practical deployment in multi-sensor autonomous systems.
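As a rough illustration of the $IoU_{MOS}$ metric mentioned above, the sketch below computes the intersection-over-union of the moving class from point-wise labels, i.e. TP / (TP + FP + FN). The function name `iou_mos`, the label id `MOVING = 1`, and the choice to return 0.0 when no moving points exist are assumptions made for this example and are not taken from the HeLiMOS tooling.

```python
from typing import Sequence

# Assumed label id for "moving"; the dataset's actual label encoding may differ.
MOVING = 1


def iou_mos(pred: Sequence[int], gt: Sequence[int], moving: int = MOVING) -> float:
    """Intersection-over-union of the moving class: TP / (TP + FP + FN)."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain one label per point")
    tp = sum(1 for p, g in zip(pred, gt) if p == moving and g == moving)
    fp = sum(1 for p, g in zip(pred, gt) if p == moving and g != moving)
    fn = sum(1 for p, g in zip(pred, gt) if p != moving and g == moving)
    denom = tp + fp + fn
    # With no moving points in either input the ratio is undefined;
    # returning 0.0 here is a convention chosen for this sketch.
    return tp / denom if denom else 0.0


gt = [1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 1, 0]
print(iou_mos(pred, gt))  # 2 TP, 1 FP, 1 FN -> 0.5
```

Static points that are correctly predicted as static do not enter the ratio, so the score is not inflated by the typically large static portion of a scan.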

Abstract

Moving object segmentation (MOS) using a 3D light detection and ranging (LiDAR) sensor is crucial for scene understanding and identification of moving objects. Despite the availability of various types of 3D LiDAR sensors on the market, MOS research still predominantly focuses on 3D point clouds from mechanically spinning omnidirectional LiDAR sensors. As a result, there is, for example, no dataset with MOS labels for point clouds from solid-state LiDAR sensors, which have irregular scanning patterns. In this paper, we present a labeled dataset, called HeLiMOS, that enables testing MOS approaches on four heterogeneous LiDAR sensors, including two solid-state LiDAR sensors. Furthermore, we introduce a novel automatic labeling method to substantially reduce the labeling effort required from human annotators. To this end, our framework exploits an instance-aware static map building approach and tracking-based false label filtering. Finally, we provide experimental results on the performance of commonly used state-of-the-art MOS approaches on HeLiMOS, which suggest a new direction for sensor-agnostic MOS that works regardless of the type of LiDAR sensor used to capture the 3D point clouds. Our dataset is available at https://sites.google.com/view/helimos.

Paper Structure (13 sections, 2 equations, 10 figures, 5 tables)

Introduction
Related Work
Instance-Aware Automatic Labeling and Data Statistics
Topology-Based Trajectory Clustering and Submap-Based Pose Correction
Instance-Aware Initial Data Annotation
Multi-Object Tracking-Based False Label Filtering and Human Refinement
Data Statistics and File Structure
Evaluation of Moving Object Segmentation and Static Map Building
Experimental Setup
Moving Object Segmentation Performance Against Environmental Changes and LiDAR Sensor Variations
Moving Object Segmentation Performance Across Heterogeneous LiDAR Sensors
Automatic Labeling Performance
Conclusion

Figures (10)

Figure 1: Qualitative examples of our dataset, called HeLiMOS. Our dataset provides point-wise moving object segmentation (MOS) annotations for point clouds acquired by heterogeneous 3D LiDAR sensors from the HeLiPR dataset [jung2023helipr]. Red points indicate the annotated points from moving objects (best viewed in color).
Figure 2: Examples of moving objects in our dataset, shown as red points. From top to bottom, these examples show the zoomed point clouds captured by Aeva Aeries II, Livox Avia, Ouster OS2-128, and Velodyne VLP-16. Note that even though the same objects are shown, they exhibit different point patterns owing to the differences in scanning techniques and fields of view of the sensors. MOS labels of (a) a bicyclist and pedestrian, (b) crowded pedestrians, (c) a car, and (d) a truck (best viewed in color).
Figure 3: Overview of our merging-and-splitting-based labeling framework. (a) Synchronization of the point clouds from the four LiDAR sensors at a software level. (b)-(d) Procedure of our proposed automatic labeling framework. (b) First, trajectories are segmented into multiple clusters. (c) For each trajectory cluster $\mathcal{C}$, we apply an instance-aware static map building, ERASOR2 [lim2023erasor2], that produces initial scan-wise annotated labels. (d) Tracking-based false label filtering is applied to reduce false positive and false negative MOS labels. (e) Next, these labels are manually corrected under human supervision. (f)-(g) Finally, the refined labels of synced scans are backpropagated to individual point clouds, which is denoted by $\pi^{-1}(\cdot)$. Red points indicate the annotated dynamic points (best viewed in color).
Figure 4: (a)-(c) Procedure of our topology-based trajectory clustering. The black trajectory indicates unclustered frames, and each color represents a different cluster (best viewed in color). (a) First, intersections are prioritized because these scenes are highly likely to have multiple revisits. (b) Next, frames from revisited places that are not intersections, and consecutive frames without revisits but with sufficiently large intervals, are each clustered, as indicated by the black dashed circles. (c) Each unclustered frame is merged into the adjacent cluster with the closest frame interval. (d) Frames included in Cluster A, which is indicated in (c), visualized along the time step axis. As a result of the clustering, several sets of consecutive frames are clustered together.
Figure 5: (a)-(c) Annotation results at each stage of our proposed labeling framework. Red points denote the annotated dynamic points, while gray points represent points estimated to be static (best viewed in color). (a) The initial result obtained by using ERASOR2. (b) Refined annotation through our tracking-based filtering. Orange dashed circles indicate regions where false positive points are successfully rejected. (c) Final annotation after human supervision. Purple dashed circles highlight the areas refined by a human labeler.
...and 5 more figures
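The merging-and-splitting idea in the Figure 3 caption, where labels assigned on a merged (synced) scan are propagated back to the individual sensor scans via $\pi^{-1}(\cdot)$, can be sketched as simple index bookkeeping. This is only a minimal illustration under stated assumptions: the names `merge_scans` and `split_labels` are hypothetical, time synchronization and transformation into a common frame are omitted, and the actual HeLiMOS implementation may work differently.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
Origin = Tuple[str, int]  # (sensor name, index of the point in that sensor's scan)


def merge_scans(scans: Dict[str, List[Point]]) -> Tuple[List[Point], List[Origin]]:
    """Concatenate per-sensor scans and record where each merged point came from."""
    merged: List[Point] = []
    origin: List[Origin] = []
    for sensor, points in scans.items():
        for i, point in enumerate(points):
            merged.append(point)
            origin.append((sensor, i))
    return merged, origin


def split_labels(
    merged_labels: List[int],
    origin: List[Origin],
    scans: Dict[str, List[Point]],
) -> Dict[str, List[int]]:
    """Hypothetical stand-in for pi^{-1}: send merged-scan labels back to each sensor."""
    per_sensor = {sensor: [0] * len(points) for sensor, points in scans.items()}
    for label, (sensor, i) in zip(merged_labels, origin):
        per_sensor[sensor][i] = label
    return per_sensor


# Toy input: two sensors with two and one points (coordinates are made up).
scans = {
    "Ouster": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "Avia": [(2.0, 0.0, 0.0)],
}
merged, origin = merge_scans(scans)
merged_labels = [0, 1, 1]  # labels assigned on the merged scan (1 = moving, assumed)
per_sensor_labels = split_labels(merged_labels, origin, scans)
print(per_sensor_labels)
```

Keeping an explicit origin record for every merged point means the labels can be split back without any nearest-neighbor search, provided the merged scan is not downsampled or reordered in between.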
