PointSSC: A Cooperative Vehicle-Infrastructure Point Cloud Benchmark for Semantic Scene Completion

Yuxiang Yan, Boda Liu, Jianfei Ai, Qinbu Li, Ru Wan, Jian Pu

TL;DR

The paper introduces PointSSC, the first cooperative vehicle-infrastructure point cloud benchmark for semantic scene completion, together with an automated annotation pipeline that uses Semantic Segment Anything to assign semantics efficiently and a LiDAR-based model built on a Spatial-Aware Transformer.

Abstract

Semantic Scene Completion (SSC) aims to jointly generate space occupancies and semantic labels for complex 3D scenes. Most existing SSC models focus on volumetric representations, which are memory-inefficient for large outdoor spaces. Point clouds provide a lightweight alternative, but existing benchmarks lack outdoor point cloud scenes with semantic labels. To address this, we introduce PointSSC, the first cooperative vehicle-infrastructure point cloud benchmark for semantic scene completion. These scenes exhibit long-range perception and minimal occlusion. We develop an automated annotation pipeline leveraging Semantic Segment Anything to efficiently assign semantics. To benchmark progress, we propose a LiDAR-based model with a Spatial-Aware Transformer for global and local feature extraction and a Completion and Segmentation Cooperative Module for joint completion and segmentation. PointSSC provides a challenging testbed to drive advances in semantic point cloud completion for real-world navigation. The code and datasets are available at https://github.com/yyxssm/PointSSC.

Paper Structure (21 sections, 5 equations, 5 figures, 3 tables)

This paper contains 21 sections, 5 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: PointSSC Overview. Given infrastructure-side partial points and images (top left), we first couple them with vehicle-side point clouds (bottom left) to construct the PointSSC dataset (bottom right). PointSSC then guides our network (top right) for point cloud semantic scene completion. The blue background indicates the PointSSC generation pipeline, while the brown dashed box shows model prediction.
  • Figure 2: Pipeline of our PointSSC dataset generation. For infrastructure-side images, we annotate semantic labels. For vehicle-infrastructure cooperative point clouds, we use ground-truth bounding boxes to separate static scenes from dynamic objects. For static scenes, we concatenate multiple frames and transfer the 2D semantic labels to the 3D points to obtain semantic static scenes. For dynamic objects, we complete them with a multi-view, multi-object completion strategy. Finally, we concatenate the semantic static scenes and the dynamic objects. ⓒ denotes the concatenation operation.
  • Figure 3: Pipeline of our PointSSC baseline. Given partial point clouds, we use PointNet++ to extract point proxies, then fuse local and global features through a Spatial-Aware Transformer. A proxy generator produces coarse upsampled point proxies, and the Completion and Segmentation Cooperative Module (CSCM) generates complete semantic points through a coarse-to-fine strategy.
  • Figure 4: Two types of data split. The first splits by time; the second splits by scene.
  • Figure 5: Visualization comparison of our model (PointSSC) with other point completion methods.
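
Two steps named in the Figure 2 caption, separating static scene points from dynamic objects with bounding boxes and transferring 2D semantic labels to 3D points, can be illustrated with a minimal sketch. It is not the authors' pipeline. The sketch assumes axis-aligned boxes (real annotations are typically oriented), points already expressed in the camera frame, and an ideal pinhole camera. The names `split_static_dynamic`, `project_labels`, `fx`, `fy`, `cx`, and `cy` are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]  # hypothetical axis-aligned box: (min corner, max corner)


def in_box(p: Point, box: Box) -> bool:
    """Return True if point p lies inside the axis-aligned box (inclusive)."""
    lo, hi = box
    return all(lo[i] <= p[i] <= hi[i] for i in range(3))


def split_static_dynamic(
    points: Sequence[Point], boxes: Sequence[Box]
) -> Tuple[List[Point], List[Point]]:
    """Points inside any box count as dynamic objects; the rest form the static scene."""
    static: List[Point] = []
    dynamic: List[Point] = []
    for p in points:
        if any(in_box(p, b) for b in boxes):
            dynamic.append(p)
        else:
            static.append(p)
    return static, dynamic


def project_labels(
    points: Sequence[Point],
    label_image: Sequence[Sequence[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    unlabeled: int = -1,
) -> List[int]:
    """Give each 3D point the label of the pixel it projects to.

    Assumes camera-frame coordinates with z pointing forward and an ideal
    pinhole model. Points behind the camera or outside the image are
    marked `unlabeled`.
    """
    height, width = len(label_image), len(label_image[0])
    labels: List[int] = []
    for x, y, z in points:
        if z <= 0:
            labels.append(unlabeled)
            continue
        u = math.floor(fx * x / z + cx)  # pixel column
        v = math.floor(fy * y / z + cy)  # pixel row
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])
        else:
            labels.append(unlabeled)
    return labels


# Toy data: one box around the second point, and a 4x4 per-pixel label map.
points = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (10.0, 0.0, 5.0), (0.0, 0.0, -1.0)]
boxes = [((0.5, -0.5, 4.5), (1.5, 0.5, 5.5))]
label_image = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]

static, dynamic = split_static_dynamic(points, boxes)
static_labels = project_labels(static, label_image, fx=1.0, fy=1.0, cx=2.0, cy=2.0)
```

In this toy run only the point inside the box is treated as dynamic. Of the three static points, one receives an image label and the other two (one outside the image, one behind the camera) stay unlabeled.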