Learning Galaxy Intrinsic Alignment Correlations
Sneh Pandya, Yuanyuan Yang, Nicholas Van Alfen, Jonathan Blazek, Robin Walters
TL;DR
The paper addresses the cost of modeling galaxy intrinsic alignments (IA), which contaminate weak-lensing signals, by building a deep learning emulator that maps a 7D HOD parameter vector to the IA correlation functions $ξ(r)$, $ω(r)$, and $η(r)$ and their aleatoric uncertainties. The emulator is an encoder–decoder neural network with mean-variance estimation and Monte Carlo dropout, trained on IA-augmented halo catalogs generated with halotools, and it predicts all three statistics jointly. Predictions of $ξ(r)$ are generally accurate to within $10\%$, and predictions correlate with the ground truth at PCC $\approx$ 0.98 for $ξ$, 0.88 for $ω$, and 0.65 for $η$; $ω(r)$ and $η(r)$ are noisier, but the model still captures their underlying signal and reports calibrated uncertainties. Because it replaces expensive simulation-based modeling with fast predictions, the emulator enables efficient Monte Carlo inference for cosmology and supports robust weak-lensing analyses and validation against larger simulations. The code is planned to be released as open source.
Abstract
The intrinsic alignments (IA) of galaxies, regarded as a contaminant in weak lensing analyses, represent the correlation of galaxy shapes due to gravitational tidal interactions and galaxy formation processes. Understanding IA is therefore paramount for accurate cosmological inference from weak lensing surveys; however, our understanding and mitigation of IA are limited by the expense of simulation-based modeling. In this work, we present a deep learning approach to emulate galaxy position-position ($ξ$), position-orientation ($ω$), and orientation-orientation ($η$) correlation function measurements and their uncertainties from halo occupation distribution-based mock galaxy catalogs. We find strong Pearson correlations between the model predictions and the measurements across all three correlation functions, and we further predict aleatoric uncertainties through a mean-variance estimation training procedure. $ξ(r)$ predictions are generally accurate to $\leq 10\%$. Our model also successfully captures the underlying signal of the noisier correlations $ω(r)$ and $η(r)$, although with lower average accuracy. We find that model performance is limited by the stochasticity of the data and would benefit from correlations averaged over multiple data realizations. Our code will be made open source upon journal publication.
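As an illustration of the two quantities named in the abstract, the sketch below shows a mean-variance estimation loss and a Pearson correlation coefficient in NumPy. It is not the authors' implementation: it assumes the common Gaussian negative log-likelihood form of the loss with a predicted log-variance, and the radial bins, the power-law stand-in for $ξ(r)$, and the 5% noise level are hypothetical toy values chosen only to make the example run.

```python
import numpy as np


def gaussian_nll(y, mu, log_var):
    """Per-element Gaussian negative log-likelihood (constant term dropped).

    Assumed form of a mean-variance estimation loss: the network would
    output both a mean `mu` and a log-variance `log_var` for each bin.
    """
    return 0.5 * (log_var + (y - mu) ** 2 / np.exp(log_var))


def pearson(a, b):
    """Pearson correlation coefficient between two 1D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))


# Toy data (hypothetical): a power law standing in for xi(r), with 5% noise.
rng = np.random.default_rng(0)
r = np.logspace(-1, 1, 20)            # hypothetical radial bins
truth = r ** -1.8                     # toy stand-in for the true signal
sigma = 0.05 * truth                  # toy aleatoric scatter per bin
observed = truth + rng.normal(0.0, sigma)

# Loss when the predicted variance matches the true scatter ...
loss_good = gaussian_nll(observed, truth, np.log(sigma ** 2)).mean()
# ... and when the predicted variance is far too small (overconfident).
loss_overconfident = gaussian_nll(
    observed, truth, np.log((0.1 * sigma) ** 2)
).mean()

# Correlation between "prediction" and "measurement" in log space.
pcc = pearson(np.log10(truth), np.log10(observed))

print(f"NLL, matched variance:       {loss_good:.3f}")
print(f"NLL, overconfident variance: {loss_overconfident:.3f}")
print(f"Pearson correlation:         {pcc:.4f}")
```

The loss penalizes an underestimated variance through the squared-residual term and an overestimated one through the log-variance term, which is why minimizing it can yield a per-bin uncertainty estimate alongside the mean.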
