Machine Learning-Driven Analysis of kSZ Maps to Predict CMB Optical Depth $\tau$

Farshid Farhadi Khouzani, Abinash Kumar Shaw, Paul La Plante, Bryar Mustafa Shareef, Laxmi Gewali

TL;DR

The paper tackles the challenge of constraining the CMB optical depth $\tau$ from the non-Gaussian kinetic Sunyaev-Zel'dovich (kSZ) signal of the Epoch of Reionization. It presents a machine learning pipeline that uses a Swin Transformer for multi-scale feature extraction from simulated kSZ maps and applies the Laplace Approximation (LA) for principled uncertainty quantification, comparing post-hoc and online modes. A seminumeric pipeline generates 1,000 map–$\tau$ pairs across diverse reionization histories for evaluation; the post-hoc LA model achieves high predictive accuracy (e.g., $R^2\approx0.93$) with well-calibrated uncertainties, outperforming the online LA variant. The results demonstrate a robust method for extracting $\tau$ from kSZ data and pave the way for applying these techniques to real CMB surveys, where reliable error bars are essential for cosmological inference.
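As a rough illustration of the two quantities named above, the sketch below computes $R^2$ and the fraction of targets that fall within one predictive sigma of the prediction. The function names and the $\tau$ values are hypothetical and are not taken from the paper; for a perfectly calibrated Gaussian predictive distribution, the one-sigma coverage would be expected to approach roughly 0.68 on a large test set.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def one_sigma_coverage(y_true, y_pred, y_std):
    """Fraction of targets lying within one predictive sigma of the prediction."""
    hits = sum(abs(t - p) <= s for t, p, s in zip(y_true, y_pred, y_std))
    return hits / len(y_true)


# Hypothetical values for illustration only (not results from the paper).
tau_true = [0.050, 0.055, 0.060, 0.065, 0.070]
tau_pred = [0.051, 0.054, 0.062, 0.064, 0.069]
tau_std = [0.002, 0.002, 0.001, 0.002, 0.002]

r2 = r_squared(tau_true, tau_pred)
coverage = one_sigma_coverage(tau_true, tau_pred, tau_std)
print(f"R^2 = {r2:.3f}, one-sigma coverage = {coverage:.2f}")
```

Coverage far above 0.68 would suggest overly wide error bars, and coverage far below it would suggest overconfident ones.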

Abstract

Upcoming measurements of the kinetic Sunyaev-Zel'dovich (kSZ) effect, which results from Cosmic Microwave Background (CMB) photons scattering off moving electrons, offer a powerful probe of the Epoch of Reionization (EoR). The kSZ signal contains key information about the timing, duration, and spatial structure of the EoR. A precise measurement of the CMB optical depth $\tau$, a key parameter that characterizes the universe's integrated electron density, would significantly constrain models of early structure formation. However, the weak kSZ signal is difficult to extract from CMB observations due to significant contamination from astrophysical foregrounds. We present a machine learning approach to extract $\tau$ from simulated kSZ maps. We train advanced machine learning models, including Swin Transformers, on high-resolution seminumeric simulations of the kSZ signal. To robustly quantify prediction uncertainties on $\tau$, we employ the Laplace Approximation (LA). The LA provides an efficient and principled Gaussian approximation to the posterior distribution over the model's weights, allowing for reliable error estimation. We investigate and compare two distinct application modes: a post-hoc LA applied to a pre-trained model, and an online LA where model weights and hyperparameters are optimized jointly by maximizing the marginal likelihood. Together, these methods provide a framework for robustly constraining $\tau$ and its associated uncertainty, which can enhance the analysis of upcoming CMB surveys such as the Simons Observatory and CMB-S4.
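To make the idea of the Laplace Approximation concrete, here is a minimal sketch on a toy one-parameter model, $y = w\,\phi$, with Gaussian noise and a Gaussian prior on $w$. It is not the paper's implementation: the single feature $\phi$, the noise level, the prior precision, and all function names are assumptions chosen for illustration. The LA places a Gaussian at the MAP estimate with variance equal to the inverse of the Hessian of the negative log posterior; for this linear-Gaussian toy the approximation happens to be exact, whereas for a neural network it is only approximate.

```python
import math


def laplace_1d(features, targets, noise_std, prior_precision):
    """Laplace approximation for the toy model y = w * phi + Gaussian noise.

    Negative log posterior (up to a constant):
        L(w) = sum((y_i - w * phi_i)^2) / (2 * noise_std^2) + prior_precision * w^2 / 2
    Returns the MAP weight and the posterior variance 1 / L''(w_map).
    """
    noise_var = noise_std ** 2
    sum_phi_sq = sum(f * f for f in features)
    sum_phi_y = sum(f * t for f, t in zip(features, targets))
    w_map = sum_phi_y / (sum_phi_sq + prior_precision * noise_var)
    hessian = sum_phi_sq / noise_var + prior_precision  # second derivative of L
    return w_map, 1.0 / hessian


def predict(feature, w_map, w_var, noise_std):
    """Predictive mean and standard deviation at a new feature value."""
    mean = w_map * feature
    var = feature ** 2 * w_var + noise_std ** 2  # weight uncertainty + noise
    return mean, math.sqrt(var)


# Hypothetical toy data (assumed values, not from the paper).
features = [1.0, 2.0, 3.0]
targets = [0.02, 0.04, 0.06]
w_map, w_var = laplace_1d(features, targets, noise_std=0.01, prior_precision=1.0)
mean, std = predict(2.5, w_map, w_var, noise_std=0.01)
print(f"w_map = {w_map:.5f}, predictive mean = {mean:.4f}, std = {std:.4f}")
```

In the post-hoc mode described above, the analogous step would be applied after training has finished; in the online mode, quantities such as the prior precision would be tuned during training via the marginal likelihood.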

Paper Structure

This paper contains 15 sections, 9 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: An example of a simulated kSZ map used in this work. The colors represent the CMB temperature fluctuation in units of microkelvin ($\mu$K). These fluctuations are caused by the scattering of CMB photons off ionized bubbles moving with peculiar velocities during the Epoch of Reionization. This map serves as a single input image for our machine learning models.
  • Figure 2: The end-to-end architecture of the Swin Transformer model used in this work for regressing the optical depth, $\tau$. An input kSZ map is first preprocessed to a size of $224 \times 224$ pixels. The image is divided into non-overlapping $4 \times 4$ patches and linearly embedded into a 96-dimensional feature space. This is followed by a four-stage Swin Transformer backbone, where patch merging layers progressively downsample the spatial resolution (from $56 \times 56$ to $7 \times 7$) while increasing the feature dimension (from 96 to 768). The output feature vector is then passed to a Multi-Layer Perceptron (MLP) regression head, which consists of three hidden layers with 95, 210, and 202 neurons, respectively, before a final output neuron produces the point estimate for $\tau$. For the post-hoc configuration, this entire trained model is passed to the Laplace Approximation library to compute a posterior distribution over the MLP head weights, which provides the final uncertainty on the $\tau$ prediction.
  • Figure 3: Illustration of the shifted-window self-attention mechanism, the core component of the Swin Transformer, shown here applied to a simulated kSZ map from our data set. In a given layer $l$ (left), the map is partitioned into regular, non-overlapping windows, and self-attention is computed only among the patches within each window. In the subsequent layer $l+1$ (right), the window grid is shifted before partitioning. This new configuration forces the self-attention calculation to cross the boundaries of the previous windows, allowing information and features to be exchanged between them. This process is the key innovation that enables the model to learn features at multiple spatial scales in our kSZ data. Figure adapted from Liu et al. (2021, arXiv:2103.14030).
  • Figure 4: Training and validation loss (Mean Squared Error) for the best-performing post-hoc model as a function of training epoch. The early stopping mechanism halted the training when the validation loss no longer improved, preventing overfitting.
  • Figure 5: Visual comparison of model performance on the test set. The top row (a, b) shows results from the post-hoc Laplace model, while the bottom row (c, d) shows results for the online Laplace model. The left column (a, c) displays scatter plots of predicted vs. true $\tau$, and the right column (b, d) includes one-sigma predictive error bars.
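
The tensor shapes quoted in the Figure 2 caption can be checked with a few lines of arithmetic. The sketch below assumes that each patch merging step halves the spatial side length and doubles the feature dimension, as in the standard Swin Transformer design, and that the regression head is a plain fully connected network with biases fed by a pooled 768-dimensional feature vector. The function names are hypothetical, and the parameter count holds only under these assumptions.

```python
def swin_stage_shapes(image_size=224, patch_size=4, embed_dim=96, num_stages=4):
    """Return (height, width, channels) of the feature map at each stage."""
    side = image_size // patch_size  # 224 / 4 = 56 patches per side
    dim = embed_dim
    shapes = [(side, side, dim)]
    for _ in range(num_stages - 1):
        side //= 2  # patch merging halves the spatial resolution
        dim *= 2    # and doubles the feature dimension
        shapes.append((side, side, dim))
    return shapes


def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network (assumed dense layers)."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


shapes = swin_stage_shapes()
# Assumed head layout: pooled backbone features -> 95 -> 210 -> 202 -> 1 output.
head_sizes = [shapes[-1][2], 95, 210, 202, 1]
head_params = mlp_param_count(head_sizes)

for stage, shape in enumerate(shapes, start=1):
    print(f"stage {stage}: {shape}")
print(f"MLP head parameters (under the stated assumptions): {head_params}")
```

The shapes run from $56 \times 56 \times 96$ to $7 \times 7 \times 768$, matching the caption. Under the stated assumptions the head has 136,040 parameters, which gives a sense of the size of the weight space over which the post-hoc Laplace posterior is computed.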