InScope: A New Real-world 3D Infrastructure-side Collaborative Perception Dataset for Open Traffic Scenarios

Xiaofei Zhang, Yining Li, Jinping Wang, Xiangyi Qin, Ying Shen, Zhengping Fan, Xiaojun Tan

TL;DR

The paper addresses occlusion in vehicle-centric perception by using an infrastructure-side perception system (IPS) and introduces InScope, a real-world, multi-position LiDAR dataset designed to mitigate occlusion in open traffic scenarios. It establishes four benchmarks (collaborative 3D object detection, multisource data fusion, data domain transfer, and 3D multiobject tracking) and introduces a new anti-occlusion metric, ξ_D, to quantify detection degradation under occlusion. In experiments with standard detectors and fusion strategies, InScope improves detection and tracking of occluded, small, and distant objects and shows favorable domain-transfer properties. The dataset and benchmarks are publicly available to support V2X research and the development of occlusion-resilient autonomous perception.

Abstract

Perception systems of autonomous vehicles are susceptible to occlusion, especially when examined from a vehicle-centric perspective. Such occlusion can lead to missed object detections; for example, larger vehicles such as trucks or buses may create blind spots in which cyclists or pedestrians are obscured, accentuating the safety concerns associated with such perception system limitations. To mitigate these challenges, the vehicle-to-everything (V2X) paradigm suggests employing an infrastructure-side perception system (IPS) to complement autonomous vehicles with a broader perceptual scope. Nevertheless, the scarcity of real-world 3D infrastructure-side datasets constrains the advancement of V2X technologies. To bridge these gaps, this paper introduces a new 3D infrastructure-side collaborative perception dataset, abbreviated as InScope. Notably, InScope is the first dataset dedicated to addressing occlusion challenges by strategically deploying multiple-position Light Detection and Ranging (LiDAR) systems on the infrastructure side. Specifically, InScope encapsulates a 20-day capture duration with 303 tracking trajectories and 187,787 3D bounding boxes annotated by experts. Four benchmarks are presented for open traffic scenarios: collaborative 3D object detection, multisource data fusion, data domain transfer, and 3D multiobject tracking. Additionally, a new metric is designed to quantify the impact of occlusion, facilitating the evaluation of detection degradation ratios among various algorithms. The experimental findings show that leveraging InScope improves the detection and tracking of 3D multiobjects in real-world scenarios, particularly the tracking of obscured, small, and distant objects. The dataset and benchmarks are available at https://github.com/xf-zh/InScope.

Paper Structure

This paper contains 30 sections, 3 equations, 11 figures, 11 tables.

Figures (11)

  • Figure 1: InScope overview. The blue and yellow dashed lines indicate the detection areas of the principal and secondary LiDAR systems, respectively, installed on the points of the blue and yellow pentagrams. Specific examples of fusion points are given at the bottom.
  • Figure 2: The temporal consistency among two LiDARs and the computer clock.
  • Figure 3: Distribution of objects with different ratios of BBS. The subfigure reports the distribution of object counts per frame.
  • Figure 4: Real-world occlusion examples from the InScope-Pri and InScope datasets. Red boxes indicate objects occluded in blind spots (grey regions), which cannot be detected using the InScope-Pri dataset.
  • Figure 5: The detection results of the CenterPoint method on the InScope-Pri and InScope datasets. The red and blue bounding boxes represent the ground truth and detection results, respectively. The green circles mark where the CenterPoint results differ between the two datasets.
  • ...and 6 more figures