Low Latency Instance Segmentation by Continuous Clustering for LiDAR Sensors
Andreas Reich, Mirko Maehlisch
TL;DR
This work tackles low-latency LiDAR instance segmentation by introducing continuous clustering, which treats the LiDAR range image as an infinitely expanding stream and outputs complete object instances online while the sensor rotates. The method uses a two-layer data structure (point trees at the bottom and a high-level graph on top) to incrementally connect nearby points within a distance threshold $d_T$, and a cluster-generation step that performs connected-component labeling as new columns arrive, publishing a cluster as soon as $\varphi_{\text{cont, rear}} > \varphi_{\text{finished},v}$. Key contributions include the real-time, column-by-column processing pipeline, support for multi-axis and solid-state LiDARs, and two real-time heuristics that boost throughput without sacrificing core correctness. Empirical results on SemanticKITTI show an average latency of $5~\mathrm{ms}$ (with $\sigma = 8~\mathrm{ms}$) on a 128-laser Velodyne in urban scenes, a substantial latency reduction at competitive segmentation performance. The approach also enables a larger field of view for challenging occluded scenarios, with potential benefits for downstream tasks such as tracking and SLAM.
Abstract
Low-latency instance segmentation of LiDAR point clouds is crucial in real-world applications because it serves as an initial and frequently used building block in a robot's perception pipeline, where every task adds further delay. Particularly in dynamic environments, this total delay can result in significant positional offsets of dynamic objects, as seen in highway scenarios. To address this issue, we employ a new technique, which we call continuous clustering. Unlike most existing clustering approaches, which use a full revolution of the LiDAR sensor, we process the data stream in a continuous and seamless fashion. Our approach does not rely on the concept of complete or partial sensor rotations with multiple discrete range images; instead, it views the range image as a single entity that grows horizontally without bound. Each new column of this continuous range image is processed as soon as it is available. Obstacle points are clustered to existing instances in real time, and a high-frequency check determines which instances are complete so that they can be published without waiting for the end of the revolution or some other integration period. In the case of rotating sensors, no problematic discontinuities between the points of the end and the start of a scan are observed. In this work, we describe the two-layered data structure and the corresponding algorithm for continuous clustering. It achieves an average latency of just 5 ms with respect to the latest timestamp of all points in the cluster. We are publishing the source code at https://github.com/UniBwTAS/continuous_clustering.
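The column-by-column idea can be illustrated with a minimal Python sketch. It is not the two-layered data structure of the paper and not the code in the linked repository: it swaps in a plain union-find over points, uses an integer column index in place of the rotation angle, and replaces the completion test with a simpler rule, namely that a cluster is finished once no point has joined it for more than `window` columns. The class name `ContinuousClusterer` and the parameters `d_t` and `window` are hypothetical names chosen for this illustration.

```python
import math
from collections import defaultdict


class ContinuousClusterer:
    """Toy streaming clusterer: one call per incoming range-image column."""

    def __init__(self, d_t, window):
        self.d_t = d_t          # distance threshold for linking two points
        self.window = window    # how many columns back neighbors are searched
        self.parent = {}        # union-find parent for every unpublished point
        self.points = {}        # point id -> (column index, coordinates)
        self.recent = []        # ids of points still inside the search window
        self.next_id = 0

    def _find(self, i):
        # Union-find root lookup with path halving.
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]
            i = self.parent[i]
        return i

    def _union(self, a, b):
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            self.parent[rb] = ra

    def add_column(self, col, pts):
        """Insert the points of column `col`; return clusters finished by now."""
        # Points older than the search window can no longer gain neighbors.
        self.recent = [i for i in self.recent
                       if col - self.points[i][0] <= self.window]
        new_ids = []
        for p in pts:
            pid = self.next_id
            self.next_id += 1
            self.parent[pid] = pid
            self.points[pid] = (col, p)
            # Link to every close point in the window (and in this column).
            for q in self.recent + new_ids:
                if math.dist(p, self.points[q][1]) < self.d_t:
                    self._union(pid, q)
            new_ids.append(pid)
        self.recent.extend(new_ids)
        return self._publish_finished(col)

    def _publish_finished(self, col):
        # Group all unpublished points by their union-find root.
        groups = defaultdict(list)
        for pid in self.points:
            groups[self._find(pid)].append(pid)
        finished = []
        for members in groups.values():
            last_col = max(self.points[m][0] for m in members)
            # Simplified completion rule (assumption of this sketch): the
            # newest point of the cluster has left the search window.
            if col - last_col > self.window:
                finished.append([self.points[m][1] for m in members])
                for m in members:
                    del self.points[m]
                    del self.parent[m]
        return finished


clusterer = ContinuousClusterer(d_t=1.0, window=1)
stream = [
    [(0.0, 0.0, 0.0)],   # column 0
    [(0.5, 0.0, 0.0)],   # column 1: joins the point from column 0
    [],                  # column 2: no obstacle points
    [(10.0, 0.0, 0.0)],  # column 3: far away, starts a new cluster
]
published = [clusterer.add_column(col, pts) for col, pts in enumerate(stream)]
```

In this toy stream the first cluster is returned while column 3 is being processed, well before any notion of a full revolution would end, which is the latency benefit the abstract describes. Regrouping all points on every column is linear in the number of unpublished points and is kept only for brevity.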
