Pedestrian intention prediction in Adverse Weather Conditions with Spiking Neural Networks and Dynamic Vision Sensors

Mustafa Sakhai; Szymon Mazurek; Jakub Caputa; Jan K. Argasiński; Maciej Wielgosz

TL;DR

The paper tackles robust pedestrian detection and intention prediction under adverse weather by integrating Spiking Neural Networks (SNNs) with Dynamic Vision Sensors (DVS) and evaluating them against CNN baselines on a CARLA-generated dataset. It introduces PLIF-based SNNs trained with surrogate gradients, augmented by ResNet adaptations and Temporally Effective Batch Normalization (TEBN), and analyzes two tasks, clip-level detection and horizon-based prediction, across scenes with varying weather. Key findings show that SNNs with DVS achieve superior energy efficiency and competitive or superior accuracy in difficult conditions, especially as temporal windows widen, while RGB-based CNNs remain stronger in clear weather. The work suggests a hybrid deployment strategy that uses DVS+SNN in adverse conditions and RGB+CNN in normal conditions, potentially enhancing safety and efficiency in autonomous-vehicle perception systems. The study provides publicly available code and datasets to foster reproducibility and future exploration of DVS-enabled neuromorphic perception in intelligent transportation systems.

Abstract

This study examines the effectiveness of Spiking Neural Networks (SNNs) paired with Dynamic Vision Sensors (DVS) for improving pedestrian detection in adverse weather, a significant challenge for autonomous vehicles. Using the high temporal resolution and low latency of DVS, which excel in dynamic, low-light, and high-contrast environments, we assess the efficiency of SNNs compared to traditional Convolutional Neural Networks (CNNs). Our experiments cover diverse weather scenarios using a custom dataset generated in the CARLA simulator to mirror real-world variability. SNN models, enhanced with Temporally Effective Batch Normalization, were trained and benchmarked against state-of-the-art CNNs on accuracy and computational efficiency in complex conditions such as rain and fog. The results indicate that SNNs integrated with DVS significantly reduce computational overhead and improve detection accuracy in challenging conditions compared to CNNs. This highlights the potential of DVS combined with bio-inspired SNN processing to enhance autonomous vehicle perception and decision-making systems, advancing the safety features of intelligent transportation systems in varying operational environments. Additionally, our research indicates that SNNs are more effective on long perception windows and prediction tasks than on simple pedestrian detection.
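The TL;DR names PLIF (Parametric Leaky Integrate-and-Fire) neurons trained with surrogate gradients. As a rough illustration only, the sketch below simulates the commonly used PLIF update with a hard reset in NumPy, together with one possible surrogate derivative (a scaled sigmoid). The function names, the threshold and reset values, and the surrogate shape are assumptions made for this example; the summary above does not state the exact formulation used in the paper.

```python
import numpy as np


def plif_forward(x, a=0.0, v_th=1.0, v_reset=0.0):
    """Simulate PLIF neurons over T time steps (illustrative sketch).

    x: input currents, shape (T, N).
    a: scalar that would be learned during training; 1/tau = sigmoid(a).
    v_th, v_reset: assumed threshold and reset potential.
    Returns (spikes, potentials), both of shape (T, N).
    """
    x = np.asarray(x, dtype=float)
    k = 1.0 / (1.0 + np.exp(-a))           # inverse time constant in (0, 1)
    v = np.full(x.shape[1], v_reset)       # membrane potential per neuron
    spikes = np.zeros_like(x)
    potentials = np.zeros_like(x)
    for t in range(x.shape[0]):
        # Leaky integration: move toward the input, decay toward v_reset.
        h = v + k * (x[t] - (v - v_reset))
        # Heaviside firing rule (non-differentiable in the forward pass).
        s = (h >= v_th).astype(float)
        # Hard reset for neurons that fired; others keep their potential.
        v = h * (1.0 - s) + v_reset * s
        spikes[t] = s
        potentials[t] = v
    return spikes, potentials


def sigmoid_surrogate_grad(u, alpha=4.0):
    """Derivative of sigmoid(alpha * u), used in place of the Heaviside
    derivative during backpropagation (one assumed surrogate choice)."""
    sg = 1.0 / (1.0 + np.exp(-alpha * np.asarray(u, dtype=float)))
    return alpha * sg * (1.0 - sg)
```

With the default parameters, a constant input of 1.5 makes the neuron cross the threshold on every second step, which shows how the leak and reset turn a steady input into a regular spike train. In a training framework, the forward pass would keep the hard threshold while the backward pass would substitute the surrogate derivative evaluated at `h - v_th`.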

Paper Structure (25 sections, 13 equations, 3 figures, 6 tables)

Introduction
Related work
Background
Spiking Neural Networks and neuron models
Basic principles of spiking neurons
Neuron model
Surrogate gradient training of spiking neural networks
ResNet architecture adaptations
Network readout
CARLA Simulator and Perception System
Dynamic Vision Sensor (DVS)
Dataset generation
Experimental Setup for Network Training
Tasks and Data Processing
Clip Classification
...and 10 more sections

Figures (3)

Figure 1: Example frames from the dataset. The first row shows samples from the good weather subset, and the second row shows samples from the bad weather subset. The first column displays images captured by the DVS camera, while the second column shows the corresponding RGB images. Each DVS-RGB pair represents the same frame within a given subset. A pedestrian is present in the scene in both of the presented frames. Note that in the bad weather frames, the pedestrian is barely visible to the human observer in the RGB images on the right.
Figure 2: Examples of incorrectly classified frames from the bad weather subset of the JAAD dataset. The predictions were made by the SPS R18T model in a single-frame prediction task. In these examples, the labels assigned to a frame do not precisely match the observed pedestrian behavior, so the prediction is counted as incorrect.
Figure 3: Comparison of an original-size DVS image of a pedestrian crossing the street in bad weather with the same frame after interpolation. The top image is the original frame; the grid below shows the interpolation results for different methods. Top left: bilinear, top right: nearest neighbor, bottom left: area, bottom right: bicubic. Note that image quality degrades significantly after interpolation, making the pedestrian much less visible to a human observer.
