Curation and Dissemination of Complex Multi-modal Data Sets for Radiation Detection, Localization, and Tracking
Nicolas Abgrall, Mark S. Bandstra, Reynold J. Cooper, Marco Salathe, Brian J. Quiter, Rajesh Sankaran, Yongho Kim, Sean Shahkarami
TL;DR
This work presents PANDAWN, a 13-node, urban, multi-modal sensing network for radiological detection, localization, and tracking. It describes an automated labeling and curation pipeline that fuses radiation, environmental, and contextual data using edge computing and a cloud-backed data store, enabling scalable data sets rich in ground truth. The approach combines continuous edge calibration, NMF-based background modeling, and computer-vision–based contextual labeling (YOLOv10 and Norfair) with triggered acquisition for event-driven data capture. The paper also summarizes several studies that leverage the curated data for isotope identification, background adaptation, and network-wide data fusion. The data sets and labeled resources aim to accelerate development of radiological/nuclear analytics and broader nonproliferation insights, with data sharing planned through public repositories.
Abstract
The PANDAWN sensor network in Chicago, IL, is a state-of-the-art test-bed for networked, multi-modal sensing. It integrates AI/data science methods throughout its operation, from data acquisition to automated data labeling and curation workflows. The curation and dissemination of diverse multi-modal data sets will enable the development of new radiological/nuclear (R/N) detection, localization, and tracking algorithms, as well as methods relevant across the nonproliferation mission space. This paper first introduces the PANDAWN sensor network and the features that distinguish it from previous multi-modal data acquisition efforts. We then review the data streams acquired on the PANDAWN nodes and present the implementation of an automated data curation pipeline that includes the labeling of radiation and contextual data streams. Finally, we provide a short overview of studies that leveraged the curated data sets.
