Classification of Radio Backgrounds at Cosmic Dawn and 21 cm Signal Confirmation Using Neural Networks

Sudipta Sikder; Anastasia Fialkov; Rennan Barkana

TL;DR

This work develops neural-network classifiers that detect whether an excess radio background is present in high-redshift 21 cm data and, if so, identify its type, distinguishing external and galactic origins from the standard CMB-only scenario. In ideal conditions, a classifier based on the power spectrum (PS) achieves ~96% accuracy and one based on the global signal (GS) reaches ~90%, with external backgrounds being the most separable. When observational effects are included, accuracy remains reasonably high for SKA-like data (~83% for PS, ~79% for GS), and a stronger global signal can further improve discrimination. Beyond classification, the authors build emulators that map between the global signal and the 21 cm power spectrum in both directions, enabling cross-validation between radiometer and interferometer observations and providing a practical consistency check for forthcoming 21 cm cosmology experiments.
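The accuracies quoted above summarize a three-class confusion matrix (no excess radio background, external, galactic). As a minimal sketch of how such a summary is formed, the Python below builds a count matrix, normalizes each row to percentages, and computes the overall accuracy. The function names and the toy labels are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter

# Class names follow the figure captions; the ordering is an arbitrary choice.
CLASSES = ["No radio", "External", "Galactic"]


def confusion_counts(true_labels, predicted_labels, classes=CLASSES):
    """Count matrix: rows are true classes, columns are predicted classes."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    pairs = Counter(zip(true_labels, predicted_labels))
    return [[pairs[(t, p)] for p in classes] for t in classes]


def row_percentages(counts):
    """Normalize each row to percentages, so every non-empty row sums to 100."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([100.0 * c / total if total else 0.0 for c in row])
    return result


def overall_accuracy(counts):
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in counts)
    correct = sum(counts[i][i] for i in range(len(counts)))
    return correct / total


# Toy labels invented purely for illustration (not results from the paper).
true_labels = ["No radio"] * 4 + ["External"] * 4 + ["Galactic"] * 4
predicted = [
    "No radio", "No radio", "No radio", "Galactic",
    "External", "External", "External", "External",
    "Galactic", "Galactic", "No radio", "Galactic",
]

counts = confusion_counts(true_labels, predicted)
percent = row_percentages(counts)
accuracy = overall_accuracy(counts)
```

Row normalization makes each row read as "of all models truly in this class, what percentage was assigned to each predicted class", which is the convention the confusion-matrix figure caption describes.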

Abstract

Several ongoing and upcoming radio telescopes aim to detect either the global 21 cm signal or the 21 cm power spectrum. The extragalactic radio background, as detected by ARCADE-2 and LWA-1, suggests a strong radio background from cosmic dawn, which can significantly alter the cosmological 21 cm signal, enhancing both the global signal amplitude and the 21 cm power spectrum. In this paper, we employ an artificial neural network (ANN) to check if there is a radio excess over the cosmic microwave background in mock data, and if present, we classify its type into one of two categories, a background from high-redshift radio galaxies or a uniform exotic background from the early Universe. Based on clean data (without observational noise), the ANN can predict the background radiation type with $96\%$ accuracy for the power spectrum and $90\%$ for the global signal. Although observational noise reduces the accuracy, the results remain quite useful. We also apply ANNs to map the relation between the 21 cm power spectrum and the global signal. By reconstructing the global signal using the 21 cm power spectrum, an ANN can estimate the global signal range consistent with an observed power spectrum from SKA-like experiments. Conversely, we show that an ANN can reconstruct the 21 cm power spectrum over a wide range of redshifts and wavenumbers given the global signal over the same redshifts. Such trained networks can potentially serve as a valuable tool for cross-confirmation of the 21 cm signal.
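The distinction between clean and noisy global-signal data can be illustrated with a short sketch. The Figure 3 caption describes the global-signal noise as random Gaussian noise with $\mu = 0$ and $\sigma = 2.5$ mK; the Python below adds such noise to a toy signal. The function name, the seed, and the toy absorption trough are assumptions made for illustration, and the power-spectrum noise model (SKA thermal noise and other effects) is not reproduced here.

```python
import math
import random


def add_gaussian_noise(signal_mK, sigma_mK, seed=None):
    """Return a copy of the signal with zero-mean Gaussian noise added per sample."""
    rng = random.Random(seed)  # local generator, so results are reproducible
    return [t + rng.gauss(0.0, sigma_mK) for t in signal_mK]


# Toy global signal in mK: a Gaussian-shaped absorption trough (hypothetical shape,
# not a simulated 21 cm signal).
redshifts = [7.0 + 0.02 * i for i in range(2000)]
clean = [-150.0 * math.exp(-0.5 * ((z - 17.0) / 2.0) ** 2) for z in redshifts]

noisy = add_gaussian_noise(clean, sigma_mK=2.5, seed=0)

# Empirical spread of the added noise, as a sanity check on the noise level.
residuals = [n - c for n, c in zip(noisy, clean)]
mean_residual = sum(residuals) / len(residuals)
std_residual = math.sqrt(
    sum((r - mean_residual) ** 2 for r in residuals) / len(residuals)
)
```

Using a dedicated `random.Random` instance keeps the mock realization reproducible without touching the global random state.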

Paper Structure

This paper contains 28 sections, 9 equations, 20 figures, 6 tables.

Introduction
Theoretical Background
Astrophysics at high redshift
21 cm signal
Excess radio background models
Methods
Methods to generate the data sets
Mock observational data sets
ANN architectures, hyper-parameter tuning, and data pre-processing
Classification of the radio background
Classification using the ideal data set
Classification using realistic mock data sets
The dependence of radio background classification on global signal amplitude
Classifications using REACH and SARAS 3 bands
$k$-fold cross-validation
...and 13 more sections

Figures (20)

Figure 1: Top panel: The global signal envelopes for three different classes of models included in our test data set for classification. Bottom panel: The envelopes for the 21 cm power spectrum at $k = 0.1$ Mpc$^{-1}$. Here we show the theoretical 21 cm signal, without any observational effects.
Figure 2: Left panel: The confusion matrix shows the performance of classifying various radio backgrounds given the 21 cm power spectrum without any observational effects. The model classes were 'No radio' (the standard astrophysical scenario without any excess radio background, i.e., CMB only), external radio models, and galactic radio models. The percentages add to 100 in each row, which corresponds to a particular true model class. Each column corresponds to a particular model class as predicted by the classifier. Right panel: A similar confusion matrix, but showing the performance of the classification procedure based on the global 21 cm signal (without any observational noise).
Figure 3: Confusion matrices, using the same setup and notation as in Fig. \ref{fig:fig1}, but here the data also include simulated observational noise. For the power spectrum we assume mock SKA 21 cm power spectra (which include thermal noise and other observational effects, see text). For the global signal, we include random Gaussian noise, for a relatively low noise level: $\mu = 0$ and $\sigma = 2.5$ mK; the results for additional noise levels ($\sigma = 17$ mK and 25 mK) are shown in Fig. \ref{fig:confusion_matrix_gs_noise}.
Figure 4: Classification accuracy as a function of the amplitude of the global signal absorption trough (without observational noise). The higher the bin number, the higher the signal amplitude (see Table \ref{tab:amplitudes_bins} for the amplitude range of each bin).
Figure 5: The total variation distance (TVD) as a function of $z$ between the normalized PDFs of the global signals (without observational noise) of galactic and external models in the training data set. TVD quantifies the difference between two probability distributions (see text). The inset plots show the normalized PDFs of the sky-averaged 21 cm brightness temperatures from galactic and external models (all models in the training sets) at $z = 17$ and $32$. At lower redshifts ($z < 27$), the TVD is low compared to its value at high redshifts, indicating that the difference between the normalized PDFs from galactic and external samples is much larger at high redshifts. Thus, the high-redshift behavior of external models is likely the reason behind the accurate classification of those models relative to the other two models, irrespective of the overall global signal amplitude.
...and 15 more figures
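
The Figure 5 caption compares normalized PDFs of the global signal using the total variation distance. A minimal sketch of that comparison at a single redshift is given below, assuming the standard discrete definition TVD = (1/2) Σ |p_i − q_i| over shared histogram bins; the paper's exact binning and definition are in its text and may differ. The function names and the toy brightness temperatures are hypothetical.

```python
def normalized_histogram(samples, lo, hi, n_bins):
    """Fraction of samples in each of n_bins equal-width bins spanning [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        if lo <= x <= hi:
            # Clamp so that x == hi falls into the last bin.
            index = min(int((x - lo) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [lo, hi]")
    return [c / total for c in counts]


def total_variation_distance(p, q):
    """Half the L1 distance between two discrete distributions on the same bins."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same bins")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Toy brightness temperatures in mK at one redshift (invented for illustration).
galactic_mK = [-250.0, -180.0, -150.0, -120.0, -90.0, -60.0]
external_mK = [-480.0, -420.0, -390.0, -150.0, -120.0, -90.0]

p = normalized_histogram(galactic_mK, lo=-500.0, hi=0.0, n_bins=5)
q = normalized_histogram(external_mK, lo=-500.0, hi=0.0, n_bins=5)
tvd = total_variation_distance(p, q)
```

With this definition the TVD ranges from 0 (identical distributions) to 1 (no overlap), so repeating the calculation at each redshift gives a curve of the kind the caption describes.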
