The Bayesian Global Sky Model (B-GSM): Validation of a Data Driven Bayesian Simultaneous Component Separation and Calibration Algorithm for EoR Foreground Modelling

George Carter, Will Handley, Mark Ashdown, Nima Razavi-Ghods

Abstract

We introduce the Bayesian Global Sky Model (B-GSM), a novel data-driven Bayesian approach to modelling radio foregrounds at frequencies below 400 MHz. B-GSM aims to address the limitations of previous models by incorporating robust error quantification and calibration. Using nested sampling, we compute the Bayesian evidence and posterior distributions for the spectral behaviour and spatial amplitudes of diffuse emission components. Bayesian model comparison is used to determine the optimal number of emission components and their spectral parametrisation. Posterior sky predictions are conditioned on both diffuse emission and absolute temperature datasets, enabling simultaneous component separation and calibration. B-GSM is validated against a synthetic dataset designed to mimic the partial sky coverage, thermal noise, and calibration uncertainties present in real observations of the diffuse sky at low frequencies. B-GSM correctly identifies a model parametrisation with two emission components featuring curved power-law spectra. The posterior sky predictions agree with the true synthetic sky within statistical uncertainty. We find that the root-mean-square (RMS) residuals between the true and posterior-predicted sky temperature as a function of local sidereal time (LST) are significantly reduced compared to the uncalibrated dataset. This indicates that B-GSM is able to correctly calibrate its posterior sky prediction to the independent absolute temperature dataset. We find that while the spectral parameters and component amplitudes exhibit some sensitivity to prior assumptions, the posterior sky predictions remain robust across a selection of different priors. This is the first of two papers and focuses on validating B-GSM's Bayesian framework; the second paper will present the results of deployment on real data and introduce the low-frequency sky model, which will be available for public download.
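
The model-comparison step described in the abstract can be illustrated with a minimal sketch. It assumes equal prior odds between candidate models, so that posterior model probabilities are proportional to the evidences. The function name, the model labels, and the log-evidence values are all hypothetical and are not results from the paper.

```python
import math


def model_posterior_probabilities(log_evidences):
    """Convert log-evidences to posterior model probabilities.

    Assumes every candidate model has the same prior probability,
    so P(model | data) is proportional to the evidence Z.
    """
    max_log_z = max(log_evidences.values())
    # Subtract the largest log-evidence before exponentiating to avoid overflow.
    weights = {
        name: math.exp(log_z - max_log_z)
        for name, log_z in log_evidences.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical log-evidences, for illustration only (not values from the paper).
example_log_evidences = {
    "1 component, curved": -1250.0,
    "2 components, straight": -1100.0,
    "2 components, curved": -1000.0,
    "3 components, curved": -1010.0,
}

probabilities = model_posterior_probabilities(example_log_evidences)
best_model = max(probabilities, key=probabilities.get)
print(best_model)
```

A difference of 10 in log-evidence corresponds to odds of $e^{10}$ to one, which is why a modest gap in log-evidence is enough to reject a competing model decisively.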

Paper Structure (23 sections, 41 equations, 13 figures, 2 tables)


Figures (13)

  • Figure 1: The two components and their spectra used to generate the synthetic dataset. The two spectra are power laws: component 1 has $\beta_1=-2.6$, $\gamma_1=0$, and component 2 has $\beta_2=-2.1$, $\gamma_2=-0.5$. Both component amplitude maps are shown in units of kelvin on a log scale; they are HEALPix maps with $N_\mathrm{side}=32$.
  • Figure 3: Illustration of the production of the synthetic absolute temperature data at 45 MHz. The left-hand panel shows the synthetic full-sky map at 45 MHz, before any calibration uncertainty is introduced. The centre panel shows the beam model (at LST = 18 hours) with which this synthetic sky is convolved; the convolution is performed with beams at every 20 minutes of LST over the range 0 to 24 hours. The right-hand panel shows the resulting synthetic $T$ vs LST curve.
  • Figure 4: The Bayesian evidence found for each of the tested candidate models. Shown in green are models whose component spectra are power laws with no curvature ($\gamma_c=0\ \forall\, c$). Shown in red are models where all component spectra are curved power laws. The Bayesian evidence is highest for a two-component model with curved power-law spectra. We strongly reject the incorrect models and select the correct number of components and spectral model type.
  • Figure 5: Corner plot of the marginal posterior for the highest evidence model, plotted using anesthetic. The contours and points show the marginal posterior distributions for each model parameter. The red dashed lines show the true value (used to generate the synthetic dataset) for each parameter.
  • Figure 6: Functional posterior plot of component spectra, produced using fgivenx. The red dashed lines show the true spectra used to generate the synthetic dataset. The black lines show posterior spectra, which cluster around the true component spectra, indicating an excellent match to the true spectra across the full frequency range.
  • ...and 8 more figures
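
The spectral parameters quoted in the Figure 1 caption can be turned into a small worked example. The sketch below assumes a commonly used curved power-law form, $(\nu/\nu_0)^{\beta + \gamma \ln(\nu/\nu_0)}$, and a reference frequency $\nu_0 = 100$ MHz; neither the functional form nor $\nu_0$ is stated in the text above, and the per-pixel amplitudes are hypothetical. It evaluates one pixel of a sky built as a sum over components of amplitude times spectrum.

```python
import math

# Hypothetical reference frequency; not specified in the text above.
NU_REF_MHZ = 100.0

# Spectral parameters quoted in the Figure 1 caption.
COMPONENTS = [
    {"beta": -2.6, "gamma": 0.0},   # component 1: no curvature
    {"beta": -2.1, "gamma": -0.5},  # component 2: curved
]


def curved_power_law(nu_mhz, beta, gamma, nu_ref_mhz=NU_REF_MHZ):
    """Assumed spectral form: (nu/nu_ref) ** (beta + gamma * ln(nu/nu_ref))."""
    x = nu_mhz / nu_ref_mhz
    return x ** (beta + gamma * math.log(x))


def sky_temperature(amplitudes_k, nu_mhz):
    """Sum amplitude * spectrum over components for a single pixel."""
    return sum(
        amplitude * curved_power_law(nu_mhz, comp["beta"], comp["gamma"])
        for amplitude, comp in zip(amplitudes_k, COMPONENTS)
    )


# Hypothetical amplitudes (kelvin) of the two components in one pixel.
pixel_amplitudes_k = [500.0, 50.0]

t_ref = sky_temperature(pixel_amplitudes_k, NU_REF_MHZ)
t_45 = sky_temperature(pixel_amplitudes_k, 45.0)
print(t_ref, t_45)
```

At the reference frequency both spectra equal one, so the pixel temperature is simply the sum of the component amplitudes; below it, the negative spectral indices make the sky brighter.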