The Turing Synthetic Radar Dataset: A dataset for pulse deinterleaving

Edward Gunn; Adam Hosford; Robert Jones; Leo Zeitler; Ian Groves; Victoria Nockles

TL;DR

This work introduces the Turing Synthetic Radar Dataset (TSRD), a large-scale, PDW-based benchmark for radar pulse deinterleaving designed to reflect realistic electronic warfare environments. It provides realistic data generated from a pipeline that simulates transmitter-receiver interactions across two receiver modes, with up to 110 emitters and ground-truth emitter labels to enable objective benchmarking. An accompanying Turing Deinterleaving Challenge standardizes evaluation using the median V-measure and offers tooling, data hosting, and leaderboards to foster reproducible research. By delivering a model-agnostic, ground-truth-rich dataset and an accessible challenge, the paper aims to accelerate progress in deinterleaving methods and related EW research.

Abstract

We present the Turing Synthetic Radar Dataset, a comprehensive dataset to serve both as a benchmark for radar pulse deinterleaving research and as an enabler of new research methods. The dataset addresses the critical problem of separating interleaved radar pulses from multiple unknown emitters for electronic warfare applications and signal intelligence. Our dataset contains a total of 6000 pulse trains over two receiver configurations, totalling almost 3 billion pulses, featuring realistic scenarios with up to 110 emitters and significant parameter space overlap. To encourage dataset adoption and establish standardised evaluation procedures, we have launched an accompanying Turing Deinterleaving Challenge, for which models need to associate pulses in interleaved pulse trains to the correct emitter by clustering and maximising metrics such as the V-measure. The Turing Synthetic Radar Dataset is one of the first publicly available, comprehensively simulated pulse train datasets aimed at facilitating sophisticated model development in the electronic warfare community.
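To make the evaluation criterion concrete, the following is a minimal sketch of how a V-measure could be computed for one pulse train and then summarised by a median. It is not the challenge's official scoring code. The function names (`v_measure`, `median_v_measure`), the equal weighting of homogeneity and completeness, and the choice to take the median across pulse trains are all assumptions made for illustration.

```python
from collections import Counter
from math import log
from statistics import median


def _entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """H(target | given) in nats, computed from joint label counts."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(true_labels, pred_labels):
    """Harmonic mean of homogeneity and completeness (equal weighting assumed)."""
    if not true_labels or len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must be non-empty and of equal length")
    h_true = _entropy(true_labels)
    h_pred = _entropy(pred_labels)
    # Homogeneity: each predicted cluster contains pulses from a single emitter.
    homogeneity = (
        1.0 if h_true == 0
        else 1.0 - _conditional_entropy(true_labels, pred_labels) / h_true
    )
    # Completeness: all pulses from one emitter land in the same cluster.
    completeness = (
        1.0 if h_pred == 0
        else 1.0 - _conditional_entropy(pred_labels, true_labels) / h_pred
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


def median_v_measure(pulse_trains):
    """Median V-measure over (true_labels, pred_labels) pairs, one per pulse train.

    Aggregating per pulse train is an assumption of this sketch.
    """
    return median(v_measure(t, p) for t, p in pulse_trains)


if __name__ == "__main__":
    # Hypothetical toy labels: emitter IDs per pulse, not real TSRD data.
    trains = [
        ([0, 0, 1, 1], [1, 1, 0, 0]),  # perfect clustering up to relabelling
        ([0, 0, 1, 1], [0, 0, 0, 0]),  # everything merged into one cluster
        ([0, 0, 1, 1], [0, 1, 2, 3]),  # every pulse in its own cluster
    ]
    print(median_v_measure(trains))
```

Because the V-measure only compares partitions, the predicted cluster IDs do not need to match the ground-truth emitter IDs. This suits deinterleaving, where the number and identity of emitters are unknown in advance.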

Paper Structure (10 sections, 6 figures, 3 tables)

This paper contains 10 sections, 6 figures, 3 tables.

Introduction
Generating a Realistic Radar Pulse Train Dataset
Dataset properties
The Turing Deinterleaving Challenge
Outlook & Conclusion
The Turing Deinterleaving Challenge
Varying sequence lengths
Feature extraction
Computational efficiency
Discussion

Figures (6)

Figure 1: The TSRD includes realistic transmitter-receiver behaviours. For each simulated pulse train, a static receiver (RX) detects pulses from multiple emitters at varying distances on a two-dimensional plane, simulating realistic signal propagation effects, such as path loss and detected angle of arrival. Pulses sent from too far away or at the wrong angle are not detected. Emitters operate in different modes, which include the pulse frequency intervals, frequency modulations, and other advanced techniques.
Figure 2: Emitted pulses substantially overlap in the parameter space, rendering straightforward deinterleaving challenging. (A) and (B) exemplify two received pulse trains over ToA and amplitude in scan and stare mode, respectively, demonstrating that emitter signals are substantially superimposed. Simple deinterleaving is challenging, requiring sophisticated model development that makes use of clean data with ground truth labels (represented by the colours in the left panels).
Figure 3: PDWs mimic realistic radar transmitters. We simulated pulse transmission and detection in realistic environments characterised by 5-feature PDWs. Panels (A) and (B) demonstrate stare and scan receiver models over frequency, pulse width, AoA, and amplitude. The substantial overlap of radar pulses suggests that successful deinterleaving can only be achieved by leveraging temporal patterns over all parts of the PDWs.
Figure 4: Emitter-level statistics are well balanced over the entire dataset. (Top) The number of emitters is approximately uniformly distributed over all pulse trains, rendering some more complex than others. Emitter numbers over 80 eventually tail off. (Bottom) The average number of pulses per emitter follows a Poisson-like distribution, as expected from count data. Statistics were computed in scan mode.
Figure 5: Distributions for amplitude, frequency, and pulse width. PDWs are differently distributed across pulse trains, as demonstrated for amplitude, frequency, and pulse width (left to right) for scan mode (top) and stare mode (bottom).
...and 1 more figures
