Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking

Whye Kit Fong, Rohit Mohan, Juana Valeria Hurtado, Lubing Zhou, Holger Caesar, Oscar Beijbom, Abhinav Valada

TL;DR

Panoptic nuScenes introduces a large-scale LiDAR benchmark with point-level semantic, panoptic, and tracking annotations across 1000 diverse urban scenes, enabling robust evaluation of dynamic scene understanding. It extends nuScenes with temporally consistent instance IDs and proposes a new instance-centric PAT metric that jointly evaluates segmentation and tracking while penalizing track fragmentation and ID switches. The authors provide extensive baselines, ablations, and cross-dataset analyses, demonstrating the dataset’s utility and revealing gaps in current end-to-end approaches. The online evaluation server and comprehensive experiments aim to accelerate research in LiDAR-based panoptic understanding for autonomous driving in complex urban environments.

Abstract

Panoptic scene understanding and tracking of dynamic agents are essential for robots and automated vehicles to navigate in urban environments. As LiDARs provide accurate, illumination-independent geometric depictions of the scene, performing these tasks using LiDAR point clouds provides reliable predictions. However, existing datasets lack diversity in the type of urban scenes and have a limited number of dynamic object instances, which hinders both the learning of these tasks and the credible benchmarking of the developed methods. In this paper, we introduce the large-scale Panoptic nuScenes benchmark dataset that extends our popular nuScenes dataset with point-wise ground-truth annotations for semantic segmentation, panoptic segmentation, and panoptic tracking tasks. To facilitate comparison, we provide several strong baselines for each of these tasks on our proposed dataset. Moreover, we analyze the drawbacks of the existing metrics for panoptic tracking and propose the novel instance-centric PAT metric that addresses these concerns. We present exhaustive experiments that demonstrate the utility of Panoptic nuScenes compared to existing datasets and make the online evaluation server available at nuScenes.org. We believe that this extension will accelerate the research of novel methods for scene understanding of dynamic urban environments.

Paper Structure

This paper contains 30 sections, 3 equations, 12 figures, 12 tables.

Figures (12)

  • Figure 1: LiDAR scans from the Panoptic nuScenes dataset showing semantic segmentation labels with point-wise semantic class annotations, panoptic segmentation labels with additional instance IDs, and panoptic tracking labels with additional temporally consistent instance IDs. Note that in panoptic tracking, the same instances in consecutive scans have the same color indicating tracking.
  • Figure 2: (Bottom) Number of LiDAR points for each class in Panoptic nuScenes. (Top) Corresponding number of scan-wise instances for each thing class.
  • Figure 3: Number of scan-wise moving instances in SemanticKITTI and Panoptic nuScenes. We only show bicycles/motorcycles with a rider as a proxy for the moving attribute. We exclude the rare on-rails and other_vehicle classes.
  • Figure 4: Analysis of how the PAT metric measures the performance with different combinations of LiDAR semantic segmentation and tracking methods. Each dot represents a single combination. A total of 924 combinations are shown.
  • Figure S.1: Front camera view of panoptic annotation examples, including construction zones (row 1), junctions (row 2), nighttime (row 3 left) and bendy bus (row 3 right). We can see that the annotations accurately outline vehicle wheels, rather than including nearby ground points (row 4).
  • ...and 7 more figures