NU-AIR -- A Neuromorphic Urban Aerial Dataset for Detection and Localization of Pedestrians and Vehicles
Craig Iaboni, Thomas Kelly, Pramod Abichandani
TL;DR
NU-AIR is an open-source neuromorphic aerial dataset for urban pedestrian and vehicle detection, captured with a drone-mounted event camera in daylight and at night and organized into 283 15-second clips with 93,204 bounding-box annotations. The authors evaluate ten frame-based DNNs, three SNNs with voxel-cube encoding, and a Recurrent Vision Transformer (RVT), report COCO-style mAP and latency metrics, and release open-source code for voxelization and model training. Frame-based DNNs generally outperform SNNs on NU-AIR; ablation studies show how depth, bias, pooling, and normalization affect SNN performance, and the RVT provides a fast, low-latency baseline. Limitations include evaluation on GPUs rather than edge neuromorphic hardware, data from a single city, and drone-induced artifacts; future work targets multi-city data, multi-modal sensing, and segmentation tasks. Overall, NU-AIR provides a benchmark for neuromorphic urban perception and informs practical deployment of aerial, event-based vision systems.
Abstract
This paper presents an open-source aerial neuromorphic dataset that captures pedestrians and vehicles moving in an urban environment. The dataset, titled NU-AIR, features 70.75 minutes of event footage acquired with a 640 x 480 resolution neuromorphic sensor mounted on a quadrotor operating in an urban environment. Crowds of pedestrians, different types of vehicles, and busy street scenes are captured at different elevations and under different illumination conditions. Manual bounding box annotations of the vehicles and pedestrians in the recordings are provided at a frequency of 30 Hz, yielding 93,204 labels in total. The dataset's fidelity is evaluated through a comprehensive ablation study of three Spiking Neural Networks (SNNs) and by training ten Deep Neural Networks (DNNs), which together validate the quality and reliability of both the dataset and its annotations. All data and the Python code to voxelize the data and subsequently train the SNNs/DNNs have been open-sourced.
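To make the voxelization step concrete, the following is a minimal sketch of how a raw event stream could be binned into a voxel grid before being fed to a network. It is not the authors' released code. The function name `voxelize_events`, the (t, x, y, p) event layout, the 0/1 polarity encoding, and the default of five temporal bins are all assumptions made for illustration; only the 640 x 480 sensor resolution comes from the text above.

```python
import numpy as np

def voxelize_events(t, x, y, p, num_bins=5, height=480, width=640):
    """Count events into a (num_bins, 2, height, width) grid.

    Hypothetical helper: t holds timestamps, x/y pixel coordinates,
    and p polarity encoded as 0 (OFF) or 1 (ON). All are assumptions.
    """
    t = np.asarray(t, dtype=np.float64)
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    p = np.asarray(p, dtype=np.int64)

    grid = np.zeros((num_bins, 2, height, width), dtype=np.float32)
    if t.size == 0:
        return grid  # no events: return an empty grid

    # Map each timestamp to a temporal bin over the window [t_min, t_max].
    t0 = t.min()
    span = max(t.max() - t0, 1e-9)  # guard against a zero-length window
    bins = ((t - t0) / span * num_bins).astype(np.int64)
    bins = np.minimum(bins, num_bins - 1)  # the last event falls in the last bin

    # Unbuffered accumulation, so repeated (bin, p, y, x) indices all count.
    np.add.at(grid, (bins, p, y, x), 1.0)
    return grid
```

Under these assumptions, a 15-second clip would be cut into short time windows, with each window passed through `voxelize_events` to produce one input tensor. Separating the ON and OFF polarities into two channels is one common design choice; signed accumulation into a single channel is another.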
