Improving pulsar search efficiency in next-generation pulsar surveys with artificial intelligence

Qiuyang Fu, Mengyao Xue, Weiwei Zhu, N. D. R. Bhat, Kaichao Wu, Zihan Zhang, B. W. Meyers, Chia Min Tan, Youling Yue, Jiarui Niu, Lingqi Meng, Ziwei Wu, Ziyao Fang, Yukai Zhou, Jiawei Jin

TL;DR

The paper tackles the bottleneck of folding massive data sets in next-generation pulsar surveys by introducing an AI-accelerated pipeline that pre-filters snapshot candidates using time-domain features. It combines a denoising autoencoder for robust snapshot representation with a hybrid SE-ResNet and CBAM-based classifier, operating on multi-scale time-domain inputs to reduce full-data foldings dramatically. Extensive testing across FAST, Parkes, Arecibo, MWA-SMART, and simulated data demonstrates high accuracy and recall (≈0.98) and speed-ups ranging from 10x to 60x, with notable generalization to new telescopes and conditions. The work provides a scalable path for integrating AI into SKA-era pulsar surveys, enabling efficient candidate screening and fast, robust pulsar discovery across diverse observing environments.

Abstract

Pulsar searching with next-generation radio telescopes requires efficiently sifting through millions of candidates generated by search pipelines to identify the most promising ones. This challenge has motivated the use of Artificial Intelligence (AI)-based tools. In this work, we explore an optimized pulsar search pipeline that uses deep learning to sift "snapshot" candidates generated by folding de-dispersed time series data. This approach significantly accelerates the search process by reducing the time spent on the folding step. We also developed a script to generate simulated pulsars for benchmarking and model fine-tuning. The benchmark analysis used the NGC 5904 globular cluster data and simulated pulsar data, showing that our pipeline reduces candidate folding time by a factor of ~10 and achieves 100% recall by recovering all known detectable pulsars in the restricted parameter space. We also tested the speed-up using data of known pulsars from a single observation in the Southern-sky MWA Rapid Two-metre (SMART) survey, achieving a conservatively estimated speed-up factor of 60 in the folding step over a large parameter space. We tested the model's ability to classify pulsar candidates using real data collected with FAST, GBT, MWA, Arecibo, and Parkes, demonstrating that our method can be generalized to different telescopes. The results show that the optimized pipeline identifies pulsars with an accuracy of 0.983 and a recall of 0.9844 on the real dataset. This approach can be used to improve the processing efficiency of the SMART survey and is also relevant for future SKA pulsar surveys.
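
To make the "snapshot" idea concrete, the following is a minimal sketch of folding a de-dispersed time series at a trial period into a time-phase array, from which an average profile follows. It is not the authors' code: the names `sample_phase` and `fold_time_series`, the sub-integration and bin counts, and the injected test pulse are all illustrative assumptions.

```python
import numpy as np

def sample_phase(n_samples, dt, period):
    """Rotational phase in [0, 1) of each sample for a trial period (seconds)."""
    return ((np.arange(n_samples) * dt) / period) % 1.0

def fold_time_series(series, dt, period, n_subint=16, n_bins=32):
    """Fold a 1-D de-dispersed time series into a (n_subint, n_bins) array.

    Rows are consecutive time intervals (sub-integrations) and columns are
    phase bins; each cell holds the mean of the samples that fall into it.
    """
    series = np.asarray(series, dtype=float)
    n = series.size
    phase = sample_phase(n, dt, period)
    phase_bin = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    subint = (np.arange(n) * n_subint) // n  # values 0 .. n_subint - 1

    sums = np.zeros((n_subint, n_bins))
    counts = np.zeros((n_subint, n_bins))
    np.add.at(sums, (subint, phase_bin), series)   # unbuffered accumulation
    np.add.at(counts, (subint, phase_bin), 1.0)
    return sums / np.maximum(counts, 1.0)          # empty cells stay at 0

# Hypothetical test signal: unit-variance noise plus a pulse confined to
# phase bin 8 of 32. All numbers here are made up for illustration.
rng = np.random.default_rng(0)
n, dt, period = 20000, 0.001, 0.1234
phase = sample_phase(n, dt, period)
on_pulse = (phase >= 8 / 32) & (phase < 9 / 32)
series = rng.normal(0.0, 1.0, n) + 5.0 * on_pulse

time_phase = fold_time_series(series, dt, period)  # time-phase array
profile = time_phase.mean(axis=0)                  # average pulse profile
peak_bin = int(np.argmax(profile))                 # bin 8 by construction
```

Folding only the one-dimensional time series avoids touching the full multi-frequency data, which is why it is the cheaper step; the frequency information is only needed later, for the candidates that survive the classifier.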

Paper Structure

This paper contains 16 sections, 5 equations, 8 figures, 8 tables.

Figures (8)

  • Figure 1: In the FFT-based pulsar search, after trying all possible DM values to remove dispersion delay and applying the FFT (Cooley & Tukey 1965) to find potential periods, we obtain a series of candidate DM-period combinations. These candidates can then be folded using the full multi-frequency data or the corresponding de-dispersed time series data.
  • Figure 2: We adopt a three-stage approach to reduce folding time. First, we fold the time series data to get initial pulsar candidates. Next, we filter these candidates using an AI classifier based on time-domain features. Finally, we fold the remaining candidates using the full data to confirm their authenticity with additional frequency information.
  • Figure 3: The candidate plot exhibits two features: feature "a" corresponds to the average pulse profile obtained by integrating the folded data along the time axis, and feature "b" is a time-phase diagram, where the x-axis shows two phases and the y-axis indicates the integration time.
  • Figure 4: Top row: average pulse profiles (feature "a"), where the left panel shows a real pulsar profile characterized by sharp peaks at certain phases due to inherent periodicity and the averaging out of noise, and the right panel shows the profile of a non-pulsar candidate. Bottom row: time-phase diagrams (feature "b"), where the left panel illustrates a real pulsar, characterized by periodic signals consistently aligned at the same phase over a time span, whereas the right panel depicts a non-pulsar candidate lacking such phase coherence.
  • Figure 5: We input the time-phase features into a Denoising Autoencoder (DAE). Each instance's x-axis represents the phase bins, while the y-axis corresponds to the time intervals. The example feature arrays are resized to dimensions of 64 by 64. The first four instances are positive samples, and the next four are negative. The DAE's denoising mechanism effectively suppresses noise and highlights the signal features.
  • ...and 3 more figures
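
The two features named in the Figure 3 and Figure 5 captions can be sketched as follows: an average profile obtained by integrating along the time axis (feature "a") and a time-phase array shown over two phases and resized to 64 by 64 (feature "b"). This is a sketch under stated assumptions rather than the paper's preprocessing: the helper names, the nearest-neighbour resampling, and the min-max scaling are illustrative choices, since the captions do not say which interpolation or normalization the authors use.

```python
import numpy as np

def resize_nearest(array, out_shape=(64, 64)):
    """Resample a 2-D array to out_shape by nearest-neighbour index lookup.

    Assumption: the interpolation method is not given in the captions;
    nearest-neighbour is used here only because it needs no extra library.
    """
    rows = (np.arange(out_shape[0]) * array.shape[0]) // out_shape[0]
    cols = (np.arange(out_shape[1]) * array.shape[1]) // out_shape[1]
    return array[np.ix_(rows, cols)]

def make_features(time_phase, out_shape=(64, 64)):
    """Build feature "a" (profile) and feature "b" (time-phase image).

    time_phase has shape (n_time_intervals, n_phase_bins).
    """
    two_phase = np.tile(time_phase, (1, 2))   # repeat so x spans two phases
    profile = two_phase.sum(axis=0)           # integrate along the time axis
    image = resize_nearest(two_phase, out_shape)
    lo, hi = image.min(), image.max()
    # Min-max scaling to [0, 1] is an assumed normalization, not the paper's.
    image = (image - lo) / (hi - lo) if hi > lo else np.zeros_like(image)
    return profile, image

# Toy candidate: a signal aligned at phase bin 8 in every time interval,
# which is the phase coherence that Figure 4 attributes to a real pulsar.
toy = np.zeros((16, 32))
toy[:, 8] = 1.0
profile, image = make_features(toy)
```

In the toy case the profile peaks twice, once per displayed phase, and the image contains two vertical stripes; a non-pulsar candidate would show no such persistent alignment.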