The Turing Synthetic Radar Dataset: A dataset for pulse deinterleaving
Edward Gunn, Adam Hosford, Robert Jones, Leo Zeitler, Ian Groves, Victoria Nockles
TL;DR
This work introduces the Turing Synthetic Radar Dataset (TSRD), a large-scale benchmark for radar pulse deinterleaving, built on pulse descriptor words (PDWs) and designed to reflect realistic electronic warfare (EW) environments. The data are generated by a pipeline that simulates transmitter-receiver interactions across two receiver modes, with up to 110 emitters and ground-truth emitter labels to enable objective benchmarking. An accompanying Turing Deinterleaving Challenge standardizes evaluation using the median V-measure and offers tooling, data hosting, and leaderboards to foster reproducible research. By delivering a model-agnostic dataset with full ground-truth labels and an accessible challenge, the paper aims to accelerate progress in deinterleaving methods and related EW research.
Abstract
We present the Turing Synthetic Radar Dataset, a comprehensive dataset designed to serve both as a benchmark for radar pulse deinterleaving research and as an enabler of new research methods. The dataset addresses the critical problem of separating interleaved radar pulses from multiple unknown emitters for electronic warfare applications and signal intelligence. It contains 6000 pulse trains across two receiver configurations, totalling almost 3 billion pulses, and features realistic scenarios with up to 110 emitters and significant parameter space overlap. To encourage dataset adoption and establish standardised evaluation procedures, we have launched an accompanying Turing Deinterleaving Challenge, in which models must assign each pulse in an interleaved pulse train to the correct emitter by clustering, maximising metrics such as the V-measure. The Turing Synthetic Radar Dataset is one of the first publicly available, comprehensively simulated pulse train datasets intended to facilitate sophisticated model development in the electronic warfare community.
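To make the evaluation metric concrete, the following is a minimal sketch of how a V-measure could be computed for one pulse train and then summarised by the median over several trains, as the TL;DR describes. It is not the challenge's scoring code: it implements the standard definition (harmonic mean of homogeneity and completeness, with equal weighting) from scratch, and the function names `v_measure` and `median_v_measure` are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def _entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure(true_emitters, predicted_clusters):
    """V-measure for one pulse train, with equal weighting (assumed beta = 1).

    Cluster IDs need not match emitter IDs; only the grouping matters.
    """
    if len(true_emitters) != len(predicted_clusters) or not true_emitters:
        raise ValueError("label sequences must be non-empty and of equal length")

    h_true = _entropy(true_emitters)
    h_pred = _entropy(predicted_clusters)

    # Homogeneity: each predicted cluster contains pulses from a single emitter.
    homogeneity = (
        1.0
        if h_true == 0
        else 1.0 - _conditional_entropy(true_emitters, predicted_clusters) / h_true
    )
    # Completeness: all pulses from one emitter land in the same cluster.
    completeness = (
        1.0
        if h_pred == 0
        else 1.0 - _conditional_entropy(predicted_clusters, true_emitters) / h_pred
    )

    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


def median_v_measure(pulse_trains):
    """Median V-measure over an iterable of (true_emitters, predicted_clusters)."""
    return median(v_measure(true, pred) for true, pred in pulse_trains)


if __name__ == "__main__":
    # Toy pulse train: two emitters, with the second emitter split in two.
    print(v_measure([0, 0, 1, 1], [0, 0, 1, 2]))
```

In the toy example, splitting one emitter across two clusters keeps homogeneity at 1 but lowers completeness to 2/3, giving a V-measure of about 0.8. Merging all pulses into a single cluster has the opposite effect: completeness stays at 1 while homogeneity drops to 0.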
