A signal dedispersion algorithm for imaging-based transient searches

Cristian Di Pietrantonio; Marcin Sokolowski; Christopher Harris; Danny C. Price; Randall Wayth

TL;DR

This paper introduces Streaming high Time-Resolution Imaging DEdispersion (STRIDE), a novel dedispersion algorithm that generates per-pixel dedispersed time series from high time and frequency resolution interferometric images. It is the first dedispersion algorithm to partition a dispersive sweep over the time dimension, in addition to frequency.

Abstract

Dedispersion is the computational process of correcting for the frequency-dependent time delay affecting a radio signal that propagates through the interstellar and intergalactic media. It is a crucial component of transient search pipelines that maximises the signal-to-noise ratio, especially when targeting highly dispersed signals: for instance, pulsar emissions making their way through a dense cloud of ionised gas, and fast radio bursts travelling cosmological distances. This paper introduces Streaming high Time-Resolution Imaging DEdispersion (STRIDE), a novel dedispersion algorithm to generate per-pixel dedispersed time series from high time and frequency resolution interferometric images. Unlike straightforward approaches to image dedispersion, STRIDE does not involve expensive manipulation of the input data layout, such as explicitly building dynamic spectra or shifting images. Furthermore, it is the first dedispersion algorithm to partition a dispersive sweep over the time dimension, in addition to frequency. As a consequence, images corresponding to the entire time span of the target dispersive delay are not required all at once. Instead, the algorithm works with an arbitrarily sized subset of images at a time, adopting an incremental, streaming-based approach to dedispersion. In evaluating STRIDE on the presented test case, it is shown that the minimum memory requirement is reduced by 97.9%, going from 684.5 GB to 14.4 GB. As current and future generations of widefield interferometers increasingly turn to imaging techniques for detection and localisation of radio transients, STRIDE positions itself as a strong alternative to traditional dedispersion methodologies. It is arguably the only viable option for imaging-based searches with low-frequency instruments such as the Murchison Widefield Array (MWA) and the low-frequency Square Kilometre Array (SKA-Low).
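As a rough illustration of the frequency-dependent delay the abstract refers to, the sketch below evaluates the standard cold-plasma dispersion relation for the Crab pulsar values quoted in the Figure 1 caption, and re-derives the memory reduction quoted above. The function name `dispersive_delay_s` and the constant name `K_DM` are illustrative choices, not names taken from the paper, and the formula is the textbook relation rather than anything specific to STRIDE.

```python
# Cold-plasma dispersion constant in s MHz^2 pc^-1 cm^3 (standard textbook value).
K_DM = 4.148808e3


def dispersive_delay_s(dm, f_lo_mhz, f_hi_mhz):
    """Delay in seconds of the low band edge relative to the high band edge.

    dm is the dispersion measure in pc cm^-3; frequencies are in MHz.
    """
    return K_DM * dm * (f_lo_mhz ** -2 - f_hi_mhz ** -2)


# Values quoted in the Figure 1 caption (Crab pulsar observed with the MWA).
dm = 57.0
centre_mhz = 154.25
bandwidth_mhz = 30.72
f_lo = centre_mhz - bandwidth_mhz / 2
f_hi = centre_mhz + bandwidth_mhz / 2

delay = dispersive_delay_s(dm, f_lo, f_hi)
print(f"dispersive delay across the band: {delay:.2f} s")

# Memory figures quoted in the abstract (GB).
reduction = 1 - 14.4 / 684.5
print(f"memory reduction: {100 * reduction:.1f} %")
```

Under these assumptions the delay comes out at roughly 4 seconds, in line with the caption, and the ratio of the two memory figures reproduces the quoted 97.9%.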

Paper Structure (14 sections, 1 theorem, 21 equations, 7 figures, 2 tables, 4 algorithms)

Introduction
Related work
Incoherent dedispersion
Motivation
Mathematical framework
The STRIDE algorithm
Ring buffer strategy
Generalisation to multiple pixels and DMs
Parallelisation strategy
Experimental results
Conclusion
Proofs
Author ORCID Identifiers
List of symbols

Key Result

Theorem 1

There are $\delta(f) - 1$ side sweeps for each frequency channel $f$ in $D_{i,j}$.
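The count in Theorem 1 can be checked by brute force under a simple model, which is an assumption made here for illustration and not the paper's proof: a sweep that enters channel $f$ at time bin $t$ occupies the $\delta(f)$ consecutive bins $t, \ldots, t + \delta(f) - 1$ of that channel, and a side sweep is one that enters before the first bin of the section yet still occupies a bin inside it. The helper name `count_side_sweeps` is hypothetical.

```python
def count_side_sweeps(delta, section_start):
    """Count sweeps that enter a channel before `section_start` but still
    occupy at least one time bin at or after it.

    delta is the discrete delay (number of bins a sweep spends in the channel).
    """
    count = 0
    # Scan a generous window of candidate entry bins before the boundary.
    for entry in range(section_start - 2 * delta, section_start):
        last_bin = entry + delta - 1  # last bin the sweep occupies in this channel
        if last_bin >= section_start:
            count += 1
    return count


for delta in range(1, 6):
    print(delta, count_side_sweeps(delta, section_start=100))
```

For every `delta` tried, the enumeration returns `delta - 1`, matching the statement of the theorem under the stated model.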

Figures (7)

Figure 1: A giant pulse from the Crab pulsar captured with the MWA. A dynamic spectrum plots the observed signal intensity as a function of frequency and arrival time. Astronomical signals manifest themselves as quadratic sweeps because of the dispersive delay caused by the interstellar and intergalactic media. The Crab pulsar has a DM of approximately 57 pc cm$^{-3}$. The dispersive delay across the 30.72 MHz band centred at 154.25 MHz is 4 seconds. The giant pulse and the fainter ones pictured in the dynamic spectrum were detected during a non-targeted transient search test using the algorithm presented in this work and implemented in the BLINK imaging pipeline [DiPietrantonio2025GPUImager].
Figure 2: Beamforming and imaging output data layouts. Beamforming produces a dynamic spectrum of dimensions $F \times T$ for each sky pointing. Time and frequency information is contiguously stored in memory, whereas dynamic spectra do not need to be, and typically are not, held in memory at the same time (left diagram). A high time and frequency resolution imaging pipeline generates images of dimensions $X \times Y$, one for each fine channel and time bin. A pixel value within an image encodes the signal intensity at the associated frequency channel, time bin, and direction in the sky (right diagram). The number of pixels does not place a memory requirement on the processing of dynamic spectra produced through beamforming because they are processed independently. Conversely, image size limits how many images can be simultaneously held in memory, and hence the number $n_t$ of time bins and the number $n_f$ of frequency channels that are available at any given time for dedispersion. STRIDE partitions images into image sets of $n_t n_f$ images each, sized so that a set can reside entirely in memory.
Figure 3: Visual representation of a section. Only a 2D portion of size $n_t \times n_f$ of a dynamic spectrum is available at any given time. It covers a subset $\mathcal{F}_i$ of the original interval $\mathcal{F}$ of frequency channels, and a range $\mathcal{T}_j$, $\mathcal{T}_j \subseteq \mathcal{T}$, of time bins. Sweeps cross the section along paths defined by the dispersive delay and accumulate the intensity values found along those paths. In this example, each frequency channel is characterised by a discrete time delay of 3 time bins. That is, $\delta(f) = 3,\ \forall f$. The two side sweeps of channel $in_f$ are depicted with indigo diamond and orange triangle patterns. These have entered the channel in the time-adjacent section $D_{i,j-1}$ and crossed the boundary with the current one. A top sweep, represented with a green parallelogram pattern, crosses the top right corner of the section. The sweep enters the top channel in the time interval encompassed by the section, traverses it entirely, and enters channel $in_f - 1$ before leaving the section. It will become a side sweep of section $D_{i,j+1}$.
Figure 4: Example execution of the dedispersion algorithm. The full dynamic spectrum $D$, spanning $T = 16$ time bins and $F = 6$ frequency channels, is partitioned into 8 sections of dimensions $n_t = 4$, $n_f = 3$. Aligned on top of $D$ is the array holding the accumulated intensities across sweep paths for a fixed DM value and for all start time bins. To simplify the example, the discrete time delay is set to be the same for each frequency channel and equal to 3. That is, $\delta(f) = 3,\ \forall f \in \{1, \ldots, 6\}$. Then, the cumulative delay $g(f)$ can be simplified as $g(f) = 2(6 - f)$. Algorithm \ref{algo:dedisp} is executed for each section of the dynamic spectrum. In this example, there are $n_t = 4$ top sweeps and $3(\delta(f) - 1) = 6$ side sweeps, two for each channel, in each section. Firstly, all top sweeps are computed. Three of these, $s(6, 2, 0)$, $s(6, 3, 0)$, and $s(6, 4, 0)$, are shown entering section $D_{2,1}$ and are depicted with a blue circle, an indigo diamond, and an orange triangle pattern, respectively. The compute_partial_sweep routine follows their path through $D_{2,1}$, accumulating the associated intensity values, until they cross the section boundary. Resulting partial intensities are added to the corresponding entries in the time series array with indexes $T(6, 2, 0) = 2$, $T(6, 3, 0) = 3$, and $T(6, 4, 0) = 4$. Sections whose time interval index is 1, $D_{1,1}$ and $D_{2,1}$ in this case, do not have side sweeps that can be completed. For instance, the red square sweep $s(4, 1, 1)$ traversing sections $D_{2,1}$, $D_{1,1}$, and $D_{1,2}$ entered the band at a time bin $T(4, 1, 1) = 1 - 1 - g(4) = -4$ not covered by $D$. Hence, it does not have an entry in the dedispersed time series array and its accumulated intensity is discarded. Actionable side sweeps are encountered when the algorithm processes section $D_{2,2}$. The indigo diamond sweep and the orange triangle one enter section $D_{2,2}$ as side sweeps of channel 6, and are now identified as $s(6, 5, 2)$ and $s(6, 5, 1)$. On the other hand, the circle sweep traverses channel 6 entirely in the time-adjacent section $D_{2,1}$ and enters section $D_{2,2}$ through channel 5 as $s(5, 5, 1)$. Their respective start times are $T(6, 5, 2) = 5 - 2 - g(6) = 3$, $T(6, 5, 1) = 4$, and $T(5, 5, 1) = 2$. Intensity contributions of section $D_{2,2}$ to those sweeps are added to the corresponding array entries, where contributions from previous sections are also found. The sweep length $L$ (Equation \ref{eq:sweep_length}) across the entire spectrum $D$ is defined as $L = g(1) + \delta(1) = 10 + 3 = 13$. The circle, diamond, and triangle sweeps reach the bottom channel within the pictured time frame. The sweep $s(6, 13, 0)$, drawn with a green parallelogram pattern, does not, and requires additional future sections of the dynamic spectrum to accumulate the intensity values over its full path.
Figure 5: Example execution of the dedispersion algorithm using a ring buffer. Depicted in this figure is the state of the algorithm at the end of the first three dedispersion cycles. A cycle ends when the buffer $S$ becomes full and ready for downstream processing. During a dedispersion cycle, the transient_search algorithm iterates over sections of a dynamic spectrum as they are produced by an upstream process. In this example there are 4 time bins in a time interval covered by a section. The full bandwidth is divided into 6 frequency channels, 3 per section. To ease the illustration of the example, the dispersive delay for each channel is chosen to be $2$ time bins, resulting in a sweep length $L = 7$. The number $B$ of additional slots in $S$ is arbitrarily set to 13. The resulting size of $S$ is $N = L + B = 20$. Slot indexes are displayed on top of the array. Every time the ring buffer becomes full, the first $h = r_c - L + 1$ elements starting from position $r_s$, coloured in green, form a short time series that is ready to be searched for transient signals. Sections contributing to the dedispersed time series are shown below the ring buffer, aligned to $S$ according to the time bins they represent. A green cell in the ring buffer stores the accumulated intensity value of a complete sweep. The path of such sweeps is also coloured in green throughout the crossed sections. The yellow colour is associated with incomplete sweeps that occupy the latest entries of the buffer. Gray cells are not associated with any sweep because $n_t$ is not always a multiple of $N - r_c$ at the beginning of a cycle.
...and 2 more figures
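The arithmetic in the Figure 4 and Figure 5 captions can be reproduced with a few lines of Python. This is a minimal sketch for the constant-delay case only. The closed form $g(f) = (\delta - 1)(F - f)$ is an assumption inferred from the two captions (it gives $g(f) = 2(6 - f)$ for $\delta = 3$ and $L = 7$ for $\delta = 2$), and the function names are illustrative, not the paper's.

```python
def cumulative_delay(f, delta, n_channels):
    """g(f) for a constant per-channel delay `delta` (assumed closed form).

    Channels are numbered 1..n_channels, with n_channels the top channel.
    """
    return (delta - 1) * (n_channels - f)


def start_time(f, t, k, delta, n_channels):
    """T(f, t, k) = t - k - g(f), as used in the Figure 4 caption."""
    return t - k - cumulative_delay(f, delta, n_channels)


def sweep_length(delta, n_channels):
    """L = g(1) + delta(1), the sweep length across the full band."""
    return cumulative_delay(1, delta, n_channels) + delta


# Figure 4 example: delta = 3, F = 6.
print(start_time(6, 5, 2, delta=3, n_channels=6))  # side sweep s(6, 5, 2)
print(start_time(5, 5, 1, delta=3, n_channels=6))  # sweep s(5, 5, 1)
print(start_time(4, 1, 1, delta=3, n_channels=6))  # discarded sweep s(4, 1, 1)
print(sweep_length(delta=3, n_channels=6))

# Figure 5 example: delta = 2, F = 6, B = 13 extra ring-buffer slots.
ring_buffer_size = sweep_length(delta=2, n_channels=6) + 13
print(ring_buffer_size)
```

With these definitions the start times evaluate to 3, 2, and -4, the Figure 4 sweep length to 13, and the Figure 5 ring buffer size to 20, matching the values stated in the captions.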

Theorems & Definitions (2)

Theorem 1
proof
