Bayesian model comparison and validation with Gaussian Process Regression for interferometric 21-cm signal recovery
Yuchen Liu, Eloy de Lera Acedo, Peter Sims
TL;DR
This paper tackles the challenge of extracting the faint 21-cm signal from cosmic dawn and reionization data in the presence of bright foregrounds by proposing a Bayesian model-comparison framework for Gaussian Process Regression (GPR). It introduces a variational autoencoder (VAE) kernel to capture realistic 21-cm covariance and evaluates five GPR models against realistic SKA-Low-like simulations using nested sampling to obtain global evidences and posterior distributions. A novel Bayesian null-test (BaNTER) validates model reliability by testing against data lacking a cosmological signal. The results show that wedge-parametrized models with noise scaling (notably αNoise) provide the strongest evidence and most accurate, unbiased 21-cm recovery, while some alternative models risk biased reconstructions, underscoring the need for rigorous model selection and validation in future SKA analyses.
Abstract
The 21-cm signal from neutral hydrogen is expected to reveal how the first cosmic structures formed during the Cosmic Dawn and the subsequent Epoch of Reionization. However, the signal is intrinsically faint relative to astrophysical foregrounds, which poses a formidable challenge for its detection. Motivated by the recent success of machine-learning-based Gaussian Process Regression (GPR) methods applied to LOFAR and NenuFAR observations, we perform a Bayesian comparison of five GPR models fitted to simulated 4-hour tracking observations with the SKA-Low telescope. The simulated sky is convolved with the instrumental beam response and includes realistic radio sources and thermal noise from 122 to 134 MHz. We apply a Bayesian model evaluation framework to the five GPR models to identify the most effective modelling strategy and determine the optimal model parameters. The GPR model with wedge parametrization ($\textit{Wedge}$) and its extension with noise scaling ($α\textit{Noise}$) achieve the highest Bayesian evidence for the observed data and the least biased 21-cm power spectrum recovery. The $α\textit{Noise}$ and $\textit{Wedge}$ models also forecast the best local power-spectrum recovery, with fractional differences of $-0.14\%$ and $0.47\%$, respectively, relative to the injected 21-cm power at $k = 0.32\ \mathrm{h\ cMpc}^{-1}$. We additionally perform Bayesian null tests to validate the five models, finding that the two optimal models pass, whereas the remaining three yield spurious detections in data containing no 21-cm signal.
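To make the two summary quantities in the abstract concrete, the following minimal sketch shows how a set of models might be ranked by log-evidence and how a fractional difference in recovered power might be computed. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the model labels, and all numerical values are hypothetical placeholders, and the fractional difference is assumed to be defined as (recovered − injected) / injected.

```python
from typing import Dict, List, Tuple


def log_bayes_factor(log_z_a: float, log_z_b: float) -> float:
    """Return ln(Z_a / Z_b); positive values favour model a over model b."""
    return log_z_a - log_z_b


def rank_by_evidence(log_z: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort models from highest to lowest log-evidence.

    Each entry pairs a model name with its ln Bayes factor relative to
    the best model, so the top entry always has the value 0.0.
    """
    best = max(log_z.values())
    relative = [(name, value - best) for name, value in log_z.items()]
    return sorted(relative, key=lambda item: item[1], reverse=True)


def fractional_difference(recovered: float, injected: float) -> float:
    """Assumed definition: (recovered - injected) / injected."""
    return (recovered - injected) / injected


# Hypothetical log-evidences, for illustration only (not results from the paper).
toy_log_z = {"model_A": -1050.0, "model_B": -1062.5, "model_C": -1100.0}
ranking = rank_by_evidence(toy_log_z)

# Hypothetical power values in arbitrary units, for illustration only.
frac = fractional_difference(recovered=99.86, injected=100.0)

for name, delta in ranking:
    print(f"{name}: ln(Z / Z_best) = {delta:.1f}")
print(f"fractional difference = {100 * frac:.2f}%")
```

In practice the log-evidences would come from a nested-sampling run for each model rather than being supplied by hand; working in log space avoids numerical underflow when the evidences differ by many orders of magnitude.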
