PCICF: A Pedestrian Crossing Identification and Classification Framework

Junyi Gu, Beatriz Cabrero-Daniel, Ali Nouri, Lydia Armini, Christian Berger

TL;DR

PCICF addresses the safety-critical problem of identifying and classifying pedestrian crossing events in autonomous driving by combining a systematically constructed synthetic dictionary (MoreSMIRK) with space-filling-curve fingerprints derived from multi-modal pedestrian tracks. The framework integrates YOLO-based detection, BoT-SORT tracking, and RoI-based filtering to extract crossing sequences, then reduces them to 1D CSP barcodes and matches them to MoreSMIRK entries for semantic classification. Evaluated on the PIE real-world dataset, PCICF achieves up to 85% accuracy for simple uni-directional crossings and provides probabilistic similarity for complex, multi-pedestrian scenarios, highlighting both strengths and current limitations. The work emphasizes computational efficiency and onboard applicability, offering an open-source replication package and a path toward scalable OOD detection and safety validation in urban ADAS/AV systems.
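
The RoI-based filtering step can be illustrated with a minimal sketch. It assumes (this is not stated in the summary above) that a tracked pedestrian activates an RoI when the bottom-centre of its bounding box falls inside that RoI. The function name `roi_activation`, the six RoI rectangles, and the sample boxes are hypothetical and are not taken from PCICF or MoreSMIRK.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; image y grows downwards
Box = Tuple[float, float, float, float]


def roi_activation(boxes: List[Box], rois: List[Box]) -> List[int]:
    """Return one 0/1 flag per RoI.

    A flag is 1 if the bottom-centre ("foot point") of at least one
    bounding box lies inside the corresponding RoI (assumed rule).
    """
    flags = [0] * len(rois)
    for x0, _y0, x1, y1 in boxes:
        foot_x, foot_y = (x0 + x1) / 2.0, y1
        for i, (rx0, ry0, rx1, ry1) in enumerate(rois):
            if rx0 <= foot_x < rx1 and ry0 <= foot_y < ry1:
                flags[i] = 1
    return flags


# Hypothetical layout: six side-by-side RoIs, each 100 px wide.
ROIS: List[Box] = [(i * 100.0, 300.0, (i + 1) * 100.0, 400.0) for i in range(6)]

# Hypothetical tracker output for three frames (one list of boxes per frame).
frames: List[List[Box]] = [
    [(10.0, 200.0, 50.0, 350.0)],    # foot point (30, 350) lies in RoI 0
    [(210.0, 200.0, 250.0, 360.0)],  # foot point (230, 360) lies in RoI 2
    [(510.0, 200.0, 550.0, 390.0),   # foot point (530, 390) lies in RoI 5
     (20.0, 100.0, 40.0, 250.0)],    # foot point (30, 250) is above all RoIs
]

# One 6-element activation vector per frame.
sequence = [roi_activation(frame_boxes, ROIS) for frame_boxes in frames]
```

Each frame is reduced to a six-element activation vector, so a crossing becomes a short sequence of such vectors that a later step can compress to one dimension.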

Abstract

We have recently observed the commercial roll-out of robotaxis in various countries. They are deployed within an operational design domain (ODD) on specific routes and under specific environmental conditions, and are subject to continuous monitoring so that control can be regained in safety-critical situations. Since ODDs typically cover urban areas, robotaxis must reliably detect vulnerable road users (VRUs) such as pedestrians, bicyclists, or e-scooter riders. To better handle such varied traffic situations, end-to-end AI, which computes vehicle control actions directly from multi-modal sensor data rather than being used only for perception, is on the rise. High-quality data is needed for systematically training and evaluating such systems within their ODD. In this work, we propose PCICF, a framework to systematically identify and classify VRU situations to support incident analysis within an ODD. We base our work on the existing synthetic dataset SMIRK, and enhance it by extending its single-pedestrian-only design into the MoreSMIRK dataset, a systematically constructed, structured dictionary of multi-pedestrian crossing situations. We then use space-filling curves (SFCs) to transform multi-dimensional features of scenarios into characteristic patterns, which we match with corresponding entries in MoreSMIRK. We evaluate PCICF with the large real-world dataset PIE, which contains more than 150 manually annotated pedestrian crossing videos. We show that PCICF can successfully identify and classify complex pedestrian crossings, even when groups of pedestrians merge or split. By leveraging computationally efficient components like SFCs, PCICF even has the potential to be used onboard robotaxis, for example for out-of-distribution (OOD) detection. We share an open-source replication package for PCICF containing its algorithms, the complete MoreSMIRK dataset and dictionary, and our experimental results at: https://github.com/Claud1234/PCICF
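
The idea of turning multi-dimensional features into a one-dimensional pattern and matching it against a dictionary can be sketched as follows. This is only an illustration under stated assumptions: it uses a Z-order (Morton) curve, which is one common SFC and not necessarily the one PCICF uses; it assumes each frame is described by six small integers (one per RoI); and the mismatch-count distance, the function names, and the two dictionary labels are hypothetical stand-ins, not MoreSMIRK entries.

```python
from typing import Dict, List, Sequence


def morton_encode(values: Sequence[int], bits: int = 2) -> int:
    """Interleave the lowest `bits` bits of every value into one Z-order index."""
    code = 0
    n = len(values)
    for bit in range(bits):
        for dim, value in enumerate(values):
            code |= ((value >> bit) & 1) << (bit * n + dim)
    return code


def fingerprint(frames: Sequence[Sequence[int]]) -> List[int]:
    """Map a sequence of multi-dimensional frames to a 1D sequence of indices."""
    return [morton_encode(frame) for frame in frames]


def distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Assumed toy distance: position-wise mismatches plus the length difference."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


def best_match(query: Sequence[int], dictionary: Dict[str, List[int]]) -> str:
    """Return the label of the dictionary entry closest to the query."""
    return min(dictionary, key=lambda label: distance(query, dictionary[label]))


# Hypothetical two-entry dictionary (labels are illustrative only).
dictionary = {
    "left_to_right": fingerprint(
        [[1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 1]]
    ),
    "right_to_left": fingerprint(
        [[0, 0, 0, 0, 0, 1], [0, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0]]
    ),
}

# Hypothetical observed crossing: starts left, passes the middle, ends right.
query = fingerprint([[1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1]])
label = best_match(query, dictionary)
```

Because encoding is a handful of bit operations per frame and matching is a linear scan over the dictionary, a scheme of this kind stays cheap, which is consistent with the abstract's emphasis on computationally efficient components.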

Paper Structure

This paper contains 20 sections, 4 figures, 7 tables, 1 algorithm.

Figures (4)

  • Figure 1: Overall workflow of PCICF. The raw camera input is a sequence from the PIE dataset, and the dashed rectangles represent the modules detailed in Section \ref{sec:pcicf}. Dark-green boxes at the top indicate the datasets and algorithms used in each module. Finally, the crossing event analysis shows four of the similarity checks against the 104 MoreSMIRK entries, which are used to obtain a semantic description of a crossing event.
  • Figure 2: Illustration of the configuration principles used to generate the MoreSMIRK dataset. The red boxes indicate the locations of the regions of interest (RoIs) in the dataset.
  • Figure 3: Dimensionality reduction performed with our AutoSFC tool$^2$ (same PIE dataset sequence as in Fig. 1): (a) shows the activation of the six RoIs (i.e., the red boxes in Fig. 2), and (b) depicts the corresponding single-dimensional representation of the 6D RoIs over time; the vertical stripes represent the crossing-specific fingerprint to be matched within MoreSMIRK.
  • Figure 4: Misclassifications of single-directional crossings by a single pedestrian. The ground truths for (a) and (b) are '_ _ X; N/A; _ _ _' and '_ _ _; N/A; Y _ _', respectively.