ASTROFLOW: A Real-Time End-to-End Pipeline for Radio Single-Pulse Searches
Guanhong Lin, Dejia Zhou, Jianli Zhang, Jialang Ding, Fei Liu, Xiaoyun Ma, Yuan Liang, Ruan Duan, Liaoyuan Liu, Xuanyu Wang, Xiaohui Yan, Yingrou Zhan, Yuting Chu, Jing Qiao, Wei Wang, Jie Zhang, Zerui Wang, Meng Liu, Chenchen Miao, Menquan Liu, Meng Guo, Di Li, Pei Wang
TL;DR
Astroflow addresses the challenge of real-time single-pulse searches in high-rate radio surveys with an end-to-end, GPU-accelerated pipeline that unifies RFI mitigation, subband dedispersion, and image-based candidate detection with an object-detection model. The system pairs a CUDA-accelerated backend with a YOLOv11N detector that processes DM–time images and produces timely candidate outputs. It is validated on FAST-FREX and QUEST data, where it shows substantial speedups over CPU baselines. Key contributions include a two-stage subband dedispersion algorithm, robust RFI filtering, and an efficient 512×512 DM–time visualization pipeline that enables real-time detection with high recall and few false positives. The work demonstrates practical scalability for next-generation facilities and offers a deployable framework that can be refined with additional data and models for large-scale transient discovery.
Abstract
Fast radio bursts (FRBs) are extremely bright, millisecond-duration cosmic transients of unknown origin. The growing number of wide-field, high-time-resolution radio surveys, particularly with next-generation facilities such as the SKA and MeerKAT, will dramatically increase FRB discovery rates, but will also produce data volumes that overwhelm conventional search pipelines. Real-time detection thus demands software that is both algorithmically robust and computationally efficient. We present Astroflow, an end-to-end, GPU-accelerated pipeline for single-pulse detection in radio time-frequency data. Built on a unified C++/CUDA core with a Python interface, Astroflow integrates RFI excision, incoherent dedispersion, dynamic-spectrum tiling, and a YOLO-based deep detector. Through vectorized memory access, shared-memory tiling, and OpenMP parallelism, it processes a typical 150 s, 2048-channel observation 10× faster than real time on consumer GPUs, while preserving high sensitivity across a wide range of pulse widths and dispersion measures. These results establish the feasibility of a fully integrated, GPU-accelerated single-pulse search stack capable of scaling to the data volumes expected from upcoming large-scale surveys. Astroflow offers a reusable and deployable solution for real-time transient discovery and provides a framework that can be continuously refined with new data and models.
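The incoherent dedispersion step that produces the DM–time images can be illustrated with a minimal NumPy sketch. This is not Astroflow's API or its two-stage subband algorithm: it is a brute-force CPU version in which every name (`dispersion_delay`, `dedisperse`, `dm_time_image`) is hypothetical and the band, sampling time, and DM grid are assumed values. It relies only on the standard cold-plasma dispersion delay, which scales as DM times the difference of inverse squared frequencies.

```python
import numpy as np

# Dispersion constant in s MHz^2 pc^-1 cm^3 (standard cold-plasma value).
K_DM = 4.148808e3


def dispersion_delay(dm, freq_mhz, ref_mhz):
    """Arrival delay in seconds at freq_mhz relative to ref_mhz for a given DM."""
    return K_DM * dm * (np.asarray(freq_mhz) ** -2.0 - ref_mhz ** -2.0)


def dedisperse(dynspec, freqs_mhz, tsamp, dm):
    """Shift each channel by its delay at this trial DM and sum over frequency.

    dynspec has shape (nchan, nsamp). np.roll wraps samples around the edges,
    which is a simplification; a real pipeline would pad or trim instead.
    """
    ref = freqs_mhz.max()  # delays are measured against the top of the band
    shifts = np.rint(dispersion_delay(dm, freqs_mhz, ref) / tsamp).astype(int)
    series = np.zeros(dynspec.shape[1])
    for chan, shift in enumerate(shifts):
        series += np.roll(dynspec[chan], -shift)
    return series


def dm_time_image(dynspec, freqs_mhz, tsamp, dms):
    """Stack dedispersed time series over trial DMs into a (ndm, nsamp) image."""
    return np.stack([dedisperse(dynspec, freqs_mhz, tsamp, dm) for dm in dms])


if __name__ == "__main__":
    # Assumed toy setup: 64 channels over 1000-1500 MHz, 1 ms sampling.
    freqs = np.linspace(1500.0, 1000.0, 64)
    tsamp = 1e-3
    data = np.zeros((64, 2048))

    # Inject a one-sample pulse at sample 500, dispersed with DM = 100 pc cm^-3.
    shifts = np.rint(dispersion_delay(100.0, freqs, freqs.max()) / tsamp).astype(int)
    data[np.arange(64), 500 + shifts] = 1.0

    dms = np.arange(0.0, 201.0, 10.0)
    image = dm_time_image(data, freqs, tsamp, dms)
    i, j = np.unravel_index(np.argmax(image), image.shape)
    print("peak at DM", dms[i], "sample", j)
```

In the resulting image a dispersed pulse is smeared at the wrong trial DMs and concentrates into a compact peak at the correct one, which is the kind of localized feature an object detector can be trained to find. A production implementation would perform the same shift-and-sum on the GPU, and a subband scheme would reduce the redundant work across neighbouring trial DMs.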
