Machine learning classification of baseband data of CHIME FRBs

Mohanraj Madheshwaran, Tetsuya Hashimoto, Tomotsugu Goto, William J. Pearson, Murthadza Aznam, Simon C. -C. Ho, Vignesh V. V. Rao, Sridhar Gajendran

TL;DR

This work addresses the misclassification of repeating FRBs as non-repeaters by applying an unsupervised learning pipeline to CHIME/FRB baseband data, exploiting its high time resolution and expanded parameter measurements. Using 11 observational and model-dependent parameters, normalized and preprocessed, the authors run a grid search to optimize a UMAP+HDBSCAN clustering approach, achieving a mean F1 score of 0.78 in cross-validation. They identify 15 repeater candidates among 122 non-repeaters; 14 of these overlap with prior work, and one was later confirmed as a repeater by CHIME/FRB. They also reclassify 31 previously labeled repeater candidates as non-repeaters. The method demonstrates the potential to flag repeater candidates from high-resolution baseband data without costly long-term monitoring, aiding FRB progenitor studies and follow-up prioritization.
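To make the cross-validation figure concrete, the following minimal sketch computes an F1 score per fold and averages the results. It assumes that "repeater" is treated as the positive class and that each fold supplies true and predicted labels; the summary does not state how the authors map clusters to labels, and the toy folds and the helper names `f1_score` and `mean_f1` are hypothetical.

```python
def f1_score(true_labels, predicted_labels, positive="repeater"):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(folds):
    """Average the per-fold F1 scores; `folds` is a list of (true, predicted)."""
    scores = [f1_score(true, predicted) for true, predicted in folds]
    return sum(scores) / len(scores)


# Hypothetical toy folds (not data from the paper).
toy_folds = [
    (["repeater", "repeater", "non-repeater", "non-repeater"],
     ["repeater", "non-repeater", "non-repeater", "non-repeater"]),
    (["repeater", "non-repeater", "non-repeater"],
     ["repeater", "repeater", "non-repeater"]),
]

print(round(mean_f1(toy_folds), 3))
```

In the first toy fold one repeater is missed (recall 0.5, precision 1); in the second a non-repeater is flagged (precision 0.5, recall 1). Both folds therefore score 2/3, and so does their mean.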

Abstract

Fast Radio Bursts (FRBs) are bright, millisecond-duration radio pulses whose origin remains unknown. A notable distinction among FRBs is that some sources repeat, while others appear to be non-repeating events. Interestingly, repeating FRBs tend to exhibit broader temporal widths and narrower spectral bandwidths than non-repeating events, suggesting they may arise from different physical mechanisms. However, current radio telescopes have limited coverage and sensitivity, which hinders a complete survey with continuous long-term monitoring. This makes it difficult to confirm repeat activity and can lead to repeaters being misclassified as non-repeaters; such sources are referred to as repeater candidates. To address this, previous studies have used machine learning techniques to classify distinct FRB types. In this study, we use the CHIME/FRB baseband catalog, which offers three orders of magnitude better time resolution than the intensity catalog. Measured fluences are available in the baseband catalog, while only upper limits are reported in the intensity catalog. We apply machine learning to the baseband catalog to evaluate classification outcomes. We identify 15 repeater candidates among 122 non-repeating FRBs in the baseband catalog. Of these repeater candidates, 14 overlap with previous findings, while 1 is newly identified in this work. Additionally, our classification identifies 31 sources previously categorized as repeater candidates as non-repeaters, highlighting a significant difference from the prior work. Notably, one of our candidates was confirmed as a repeater by CHIME/FRB. Follow-up observations of the remaining 14 candidates are highly encouraged.

Paper Structure

This paper contains 14 sections, 9 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: The redshift difference between intensity and baseband catalogs, plotted against the baseband redshift as $(1 + z_{\rm baseband})$. The difference is calculated by $\Delta z = z_{\rm intensity} - z_{\rm baseband}$. Highlighted FRBs exhibit significant deviations from the line of equality (horizontal dashed line), suggesting inconsistencies in their redshifts and potentially low positional accuracy.
  • Figure 2: Distributions of both observed and model-dependent parameters for repeaters (red) and non-repeaters (blue), plotted after the sample selection described in Section \ref{sample}.
  • Figure 3: Silhouette scores from the grid search. The $min\_cluster\_size=5$ case reaches its maximum at $n\_neighbors=3$. The figure is shown for $min\_dist=0.01$ and $min\_samples=4$.
  • Figure 4: Davies-Bouldin scores from the grid search. The $min\_cluster\_size=5$ case reaches its minimum at $n\_neighbors=3$. The figure is shown for $min\_dist=0.01$ and $min\_samples=4$.
  • Figure 5: F1 scores of the six folds used for cross-validation. The mean F1 score is shown as a red dotted horizontal line.
  • ...and 5 more figures
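
The grid search in Figures 3 and 4 is read in opposite directions: a higher silhouette score and a lower Davies-Bouldin score both indicate tighter, better-separated clusters. The sketch below implements the standard definitions of both metrics in plain Python and applies them to two candidate labelings of a toy point set. The points, the labelings, and the helper names are hypothetical stand-ins for the cluster assignments that a UMAP+HDBSCAN run would produce at each grid point; this is not the authors' code, and noise labels are not treated specially.

```python
from math import dist


def group_by_label(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = group_by_label(points, labels)
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def davies_bouldin(points, labels):
    """Davies-Bouldin index; requires at least two distinct centroids."""
    clusters = list(group_by_label(points, labels).values())
    centers = [tuple(sum(c) / len(cl) for c in zip(*cl)) for cl in clusters]
    # scatter: mean distance of each cluster's points to its centroid
    scatter = [
        sum(dist(p, ctr) for p in cl) / len(cl)
        for cl, ctr in zip(clusters, centers)
    ]
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(len(clusters))
            if j != i
        )
        for i in range(len(clusters))
    ]
    return sum(worst) / len(clusters)


# Hypothetical toy data: two tight pairs of points far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = [0, 0, 1, 1]  # groups the nearby points together
bad = [0, 1, 0, 1]   # pairs each point with a distant one

candidates = {"good": good, "bad": bad}
best = max(candidates, key=lambda name: silhouette(points, candidates[name]))
print(best, davies_bouldin(points, good), davies_bouldin(points, bad))
```

On this toy set the sensible labeling has a positive silhouette score and the smaller Davies-Bouldin index, so both criteria pick it, which mirrors how a single hyperparameter choice can be favored by both figures.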