Robust CMB B-mode analysis with Needlet-ILC and simulation-based inference

Adriaan J. Duivenvoorden, Kristen Surrao, Adrian E. Bayer, Alexandre E. Adler, Nadia Dachlythra, Susanna Azzoni, J. Colin Hill

TL;DR

This paper presents a simulation-based inference framework that couples Needlet-ILC (NILC) compression with cross-spectral statistics and conditional normalizing flows to robustly infer the tensor-to-scalar ratio $r$ from large-scale CMB polarization data. By compressing multi-frequency maps into a 165-element vector that includes CMB, dust, synchrotron, and first-order foreground SED-moment components, and by training a neural posterior estimator on simulated data, the method remains unbiased even under strong foreground anisotropy modeled by PySM. Compared with traditional multi-frequency likelihoods, SBI with joint NILC compression achieves higher robustness and improved constraints on $r$, especially when foreground complexity is significant, and demonstrates feasibility for a ground-based experiment akin to the Simons Observatory. The framework supports marginalization over a wide range of nuisance parameters and instrumental systematics through forward simulations, enabling more realistic foreground modeling and uncertainty quantification for current and future CMB polarization analyses.

Abstract

We explore a novel analysis framework for parameter inference with large-scale CMB polarization data. Our method uses simulation-based inference combined with the needlet internal linear combination (NILC) algorithm and cross-correlation-based statistics to compress the data into a vector that is robust to model misspecification and small enough to be amenable to neural posterior estimation with normalizing flows. By leveraging this compressed data representation, our method enables the robust use of the anisotropic and non-Gaussian information in the foreground fields to more accurately separate the CMB polarization signal from these contaminants. Using an idealized ground-based experimental setup inspired by the Simons Observatory Small Aperture Telescopes, we demonstrate improved statistical constraining power for the tensor-to-scalar ratio $r$ compared to the (constrained) NILC algorithm and improved robustness to complex foregrounds compared to other techniques in the literature. Trained on a relatively simple semi-analytical foreground model, the method yields unbiased $r$ results across a range of PySM Galactic foreground simulations, including the high-complexity d12 model, for which we obtain $r=(1.09 \pm 0.27)\cdot 10^{-2}$ for input $r=0.01$ and sky fraction $f_{\mathrm{sky}} = 0.21$. We thus demonstrate the feasibility and advantages of a complete, maps-to-parameters, simulation-based analysis of large-scale CMB polarization for current ground-based observatories.
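
As a rough illustration of the internal linear combination idea that the NILC step builds on, the sketch below computes the standard minimum-variance ILC weights $w = C^{-1}a / (a^{T} C^{-1} a)$ on toy data. This is a plain pixel-domain ILC, not the needlet-domain version used in the paper, and every name, amplitude, and frequency scaling in it (`ilc_weights`, `a_fg`, the noise level) is a hypothetical choice made for illustration only.

```python
import numpy as np

def ilc_weights(maps, sed):
    """Minimum-variance weights that preserve a component with response `sed`.

    maps: array of shape (n_freq, n_pix); sed: array of shape (n_freq,).
    """
    cov = np.cov(maps)                  # empirical (n_freq, n_freq) covariance
    cinv_a = np.linalg.solve(cov, sed)  # C^{-1} a
    return cinv_a / (sed @ cinv_a)      # w = C^{-1} a / (a^T C^{-1} a)

# Toy data (assumed, not from the paper): one CMB-like signal with a flat
# frequency response, one foreground with a hypothetical scaling, white noise.
rng = np.random.default_rng(0)
n_freq, n_pix = 6, 20000
cmb = rng.normal(size=n_pix)
fg = 3.0 * rng.normal(size=n_pix)
a_cmb = np.ones(n_freq)
a_fg = np.linspace(0.5, 4.0, n_freq)
noise = 0.1 * rng.normal(size=(n_freq, n_pix))
maps = np.outer(a_cmb, cmb) + np.outer(a_fg, fg) + noise

w = ilc_weights(maps, a_cmb)
cleaned = w @ maps             # ILC estimate of the CMB-like signal
naive = maps.mean(axis=0)      # unweighted average, for comparison

print("sum of weights along the CMB response:", w @ a_cmb)
print("residual variance, ILC vs. naive:",
      np.var(cleaned - cmb), np.var(naive - cmb))
```

The constraint $w^{T}a = 1$ keeps the target signal unbiased, while minimizing the output variance suppresses anything with a different frequency scaling. A needlet ILC applies the same weight formula separately per needlet scale and sky region, which is what lets it adapt to anisotropic foregrounds.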

Paper Structure

This paper contains 22 sections, 38 equations, 16 figures.

Figures (16)

  • Figure 1: TARP coverage tests. Top four panels: the joint coverage for different amounts of training simulations. Bottom two panels: the TARP test applied to the marginal posteriors of $r$ and $A_{\mathrm{lens}}$. The contours show $\pm2$ standard deviations of the test statistic estimated using bootstrap resampling.
  • Figure 2: Comparison of posteriors from the standard multi-frequency power spectrum likelihood (blue) and our new joint-NILC SBI technique (orange). Both use the same data, generated without spatial variation in the spectral index fields or other sources of statistical anisotropy, making the likelihood approach statistically optimal; this comparison thus validates that our method reproduces the correct results in this simple setting.
  • Figure 3: Comparison between inferred $r$ values for our new joint NILC SBI method and the standard multi-frequency power-spectrum-likelihood-based method (i.e. without the moment marginalization of \cite{azzoni_2021}). The error bars show the highest density interval that contains 68% of the probability. The points are ordered by the mean standard deviation of the $\beta_{\mathrm{d}}$ and $\beta_{\mathrm{s}}$ fields inferred by the SBI method. We find this to be a decent indicator of both the bias of the likelihood method and the posterior width of the SBI method. The test data for this plot are generated for fixed $r=0.01$. The SBI method produces unbiased results regardless of spectral index power, while the likelihood-based method is biased over the entire range and becomes particularly biased as the inferred spectral index power increases. Solid vertical gray lines connect SBI and likelihood-based points that are generated from the same input data. Dashed lines show the standard deviation of $\beta_{\mathrm{d}}$ and $\beta_{\mathrm{s}}$ for three different combinations of PySM models evaluated over a 21% sky mask, see Sec. \ref{sec:pysm_results}. For the $\texttt{d12}$ model, $\sigma_{\beta_\mathrm{d}}$ has been determined from a per-pixel fit of Eq. \ref{eq:dust_sed} and should be considered a conservatively low estimate. To aid the visualization, we have thinned the distribution of points for this plot by randomly selecting ten points in each of 15 logarithmic bins. In addition, likelihood points at high values of $(\hat{\sigma}_{\beta_\mathrm{d}}^2 + \hat{\sigma}_{\beta_\mathrm{s}}^2)^{1/2}$ for which the power-spectrum-based MCMC sampling fails due to multimodality are not displayed.
  • Figure 4: Standard deviation (half the width of the 68% highest density interval) of $r$ as a function of the mean inferred standard deviation of the $\beta_{\mathrm{d}}$ and $\beta_{\mathrm{s}}$ fields. Colors indicate the true value of the standard deviation. The horizontal gray error bars denote the 68% highest density interval of the posterior on $(\sigma_{\beta_{\mathrm{d}}}^2 + \sigma_{\beta_{\mathrm{s}}}^2)^{1/2}$. Each dot summarizes the posterior for one of the simulated datasets described in Sec. \ref{sec:results_joint_nilc}, and the panels show the different SBI setups described in Table \ref{table:nilc_types}. The inclusion of foreground maps clearly improves the constraining power of the joint NILC method compared to the standard NILC and constrained NILC setups. The inferred power of the spectral index fields appears to be a good indicator of the uncertainty on $r$.
  • Figure 5: Posterior distributions for different PySM foreground simulations. The CMB and noise realizations are shared between all cases. The true values of the $r$ and $A_{\mathrm{lens}}$ parameters are denoted with dashed gray lines. The same conditional normalizing flow, trained on the relatively simple simulations used in this work, produces an unbiased posterior for the $r$ and $A_{\mathrm{lens}}$ parameters for all types of foreground models, demonstrating the robustness provided by the NILC-based compression strategy. The marginal posteriors of $B_{\mathrm{d}}$ and $\gamma_{\mathrm{d}}$ indicate that the method is able to infer that the more complex models have increased power in the dust spectral index field.
  • ...and 11 more figures
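
The captions of Figures 3 and 4 summarize each posterior by its 68% highest density interval and quote half its width as the standard deviation. One common way to estimate such an interval from posterior samples is to take the shortest window that contains 68% of the sorted samples. The sketch below assumes that estimator and uses toy Gaussian samples; the function name `hdi` and all values are hypothetical, and the source does not say which estimator the authors used.

```python
import numpy as np

def hdi(samples, mass=0.68):
    """Shortest interval containing a fraction `mass` of 1-D samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))             # number of samples inside the interval
    widths = x[k - 1:] - x[: n - k + 1]    # width of each window of k consecutive samples
    i = int(np.argmin(widths))             # start index of the narrowest window
    return x[i], x[i + k - 1]

# Toy posterior samples (assumed): unit Gaussian centred on zero.
rng = np.random.default_rng(1)
samples = rng.normal(loc=0.0, scale=1.0, size=200_000)

lo, hi = hdi(samples)
half_width = 0.5 * (hi - lo)   # the "standard deviation" in the sense of Figure 4
print(lo, hi, half_width)
```

For a Gaussian posterior the half-width defined this way is close to the usual standard deviation, while for skewed posteriors, such as those bounded below at $r = 0$, the interval becomes asymmetric about the mode.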