IAEmu: Learning Galaxy Intrinsic Alignment Correlations

Sneh Pandya, Yuanyuan Yang, Nicholas Van Alfen, Jonathan Blazek, Robin Walters

TL;DR

IAEmu addresses intrinsic alignments (IA) in weak lensing with a differentiable neural network emulator that maps halo occupation distribution (HOD) and IA parameters to three two-point statistics (ξ, ω, η) and their uncertainties. The model achieves ~3% average error for ξ and ~5% for ω, captures the stochasticity of η without overfitting, and provides both aleatoric and epistemic uncertainties to flag less reliable predictions. On GPUs, IAEmu is ~10^4× faster than CPU-based simulations and supports gradient-based inverse modeling; it also generalizes to non-HOD alignment signals when fit to IllustrisTNG data. The results point to direct applications in Stage IV surveys. Limitations include reliance on the HOD framework and the lack of explicit cosmology dependence, with future directions toward joint HOD-IA inference and field-level or diffusion-based approaches.

Abstract

The intrinsic alignments (IA) of galaxies, a key contaminant in weak lensing analyses, arise from correlations in galaxy shapes driven by tidal interactions and galaxy formation processes. Accurate IA modeling is essential for robust cosmological inference, but current approaches rely on perturbative methods that break down on nonlinear scales or on expensive simulations. We introduce IAEmu, a neural network-based emulator that predicts the galaxy position-position ($\xi$), position-orientation ($\omega$), and orientation-orientation ($\eta$) correlation functions and their uncertainties using mock catalogs based on the halo occupation distribution (HOD) framework. Compared to simulations, IAEmu achieves ~3% average error for $\xi$ and ~5% for $\omega$, while capturing the stochasticity of $\eta$ without overfitting. The emulator provides both aleatoric and epistemic uncertainties, helping identify regions where predictions may be less reliable. We also demonstrate generalization to non-HOD alignment signals by fitting to IllustrisTNG hydrodynamical simulation data. As a fully differentiable neural network, IAEmu enables $\sim$10,000$\times$ speed-ups in mapping HOD parameters to correlation functions on GPUs, compared to CPU-based simulations. This acceleration facilitates inverse modeling via gradient-based sampling, making IAEmu a powerful surrogate model for galaxy bias and IA studies with direct applications to Stage IV weak lensing surveys.

Paper Structure

This paper contains 16 sections, 21 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Ranges of HOD parameters used in generating the training data from halotools-IA. We draw uniform random values for the four occupation parameters other than $\log M_{\text{min}}$. Each of these parameters is modeled by a linear fit against $\log M_{\text{min}}$, which serves as a central line, and the sampling range extends $4 \cdot \text{RMSE}$ around this line. For clarity, $\sigma_{\log (M)}$ is displayed separately from the mass variables. Each panel shows published data from Zheng et al. (2007) as a solid line, while the dotted line of the same color shows the linear fit to $\log M_{\text{min}}$ and the shaded area indicates the range used for uniform random sampling of each parameter. Not shown here are the two alignment parameters, $\mu_{\text{cen}}$ and $\mu_{\text{sat}}$, which both vary uniformly on the range $[-1, 1]$ independently of the five occupation parameters.
  • Figure 2: Model pipeline. The HOD input parameters are normalized before entering the 7-layer multilayer perceptron (MLP) embedding network. The embedding network expands the dimensionality of the input before a bottleneck latent space that transitions to the decoder stage, which features seven 1D convolutional layers that learn the local correlations present in the output correlation functions, $\log \xi$, $\omega$, and $\eta$. Both the embedding network and decoder feature residual connections to aid the convergence of IAEmu during training. IAEmu is trained using the $\beta$-NLL loss (Seitzer et al. 2022) with a 100-epoch warm-up period corresponding to mean-squared-error optimization before re-introducing aleatoric uncertainties into the optimization. The generated correlation functions are then rescaled back to their original values. A detailed description of the model training procedure is given in Appendix \ref{sec:appendix-training}. The $N$-body simulation visualization in the right panel is from nbody.
  • Figure 3: Top: Average fractional error for the position-position ($\xi$), position-orientation ($\omega$), and orientation-orientation ($\eta$) correlation function predictions in the test set, shown in purple. Middle: Median residuals of the test set predictions, shown in blue, expressed in units of the standard deviation of the ground truth data, $\hat{\sigma}$, obtained from the 10 realizations used to construct the dataset. Bottom: Per-bin Spearman correlation coefficient (SCC, green), normalized root-mean-square error (NRMSE, pink), and symmetric mean absolute percentage error (SMAPE, orange) for the correlation functions. A black vertical dashed line in all plots indicates the transition in $r$ between the 1-halo and 2-halo regimes. On average, $\xi$ has a $3\%$ error and $\omega$ has a $5\%$ error. Although its fractional error is larger, $\eta$ predictions are on average strictly within 1$\sigma$ of the true uncertainty. The same holds for $\omega$, whereas $\xi$ exhibits a bias at large $r$, reflecting the higher fractional error. Both $\xi$ and $\omega$ exhibit large SCC values and low NRMSE and SMAPE values across all bins, indicating good performance. For $\eta$, the SCC value at low $r$ (SCC $\geq 0.5$) indicates a strong correlation between IAEmu predictions and the ground truth. This gradually decreases at the onset of the 2-halo regime, with the NRMSE and SMAPE performance degrading as well.
  • Figure 4: Aleatoric vs. epistemic uncertainty comparison for $\omega$ and $\eta$ with uncertainty bias. For test-set predictions, we analyze the total spread of the aleatoric uncertainties of the data predicted by IAEmu and the epistemic uncertainties due to the stochasticity of IAEmu. The coloring corresponds to the log-residual between the aleatoric uncertainties predicted by IAEmu and the (true) aleatoric uncertainties from halotools-IA, computed from the 10 realizations used in producing the dataset. The epistemic uncertainty is generally smaller than the aleatoric uncertainty, as shown by the majority of the scatter falling below the 1:1 line in aleatoric-epistemic uncertainty space. A general bias of 0.42 dex for $\omega$ and 0.24 dex for $\eta$ is observed between the true and predicted aleatoric uncertainties, with IAEmu uncertainty estimates being biased high. This bias is exacerbated near the 1:1 line, where the epistemic uncertainty of IAEmu is comparable to the predicted aleatoric uncertainty.
  • Figure 5: Aleatoric vs. epistemic uncertainty comparison for $\omega$ and $\eta$ with correlation amplitude bias. For test-set predictions, we analyze the total spread of the aleatoric uncertainties of the data predicted by IAEmu and the epistemic uncertainties due to the stochasticity of IAEmu. The coloring corresponds to the log-residual between the correlation amplitudes predicted by IAEmu and the (mean) ground truth amplitudes from halotools-IA, computed from the 10 realizations used in producing the dataset. For $\omega$, there is no clear correlation between the amplitude residuals and the IAEmu aleatoric and epistemic uncertainties. For $\eta$, the sharpest log-residual occurs for predictions in the region where the IAEmu aleatoric uncertainty is $\approx 2$ dex larger than the associated epistemic uncertainty. This may be an instance of IAEmu overfitting, wherein the intrinsic uncertainty of the model on the correlation amplitude is negligible compared to the correlation's own uncertainty.
  • ...and 6 more figures