Field-level simulation-based inference with galaxy catalogs: the impact of systematic effects
Natalí S. M. de Santi, Francisco Villaescusa-Navarro, L. Raul Abramo, Helen Shao, Lucia A. Perez, Tiago Castro, Yueying Ni, Christopher C. Lovell, Elena Hernandez-Martinez, Federico Marinacci, David N. Spergel, Klaus Dolag, Lars Hernquist, Mark Vogelsberger
TL;DR
This work extends field-level likelihood-free inference for cosmology from galaxy catalogs by incorporating realistic observational systematics into a graph-neural-network framework. Using thousands of CAMELS hydrodynamic simulations, it builds galaxy graphs and predicts the posterior mean and uncertainty of $\Omega_{\rm m}$ via moment neural networks, testing robustness to masking, velocity and distance errors, and galaxy selection criteria. The results show that the approach remains robust across most systematics, with over 90% of catalogs maintaining high performance after outlier removal, though certain effects (notably large velocity perturbations and some selections) degrade accuracy in some simulations such as Magneticum. This demonstrates the potential of applying field-level GNN inference to real galaxy data, while highlighting the need for larger-volume simulations and broader parameter coverage to fully realize its cosmological constraining power.
Abstract
It has recently been shown that a powerful way to constrain cosmological parameters from galaxy redshift surveys is to train graph neural networks to perform field-level likelihood-free inference without imposing cuts on scale. In particular, de Santi et al. (2023) developed models that accurately infer the value of $\Omega_{\rm m}$ from catalogs containing only the positions and radial velocities of galaxies, and that are robust to uncertainties in astrophysics and subgrid models. However, observations are affected by many effects, including 1) masking, 2) uncertainties in peculiar velocities and radial distances, and 3) different galaxy selections. Moreover, observations only allow us to measure redshift, which intertwines galaxies' radial positions and velocities. In this paper we train and test our models on galaxy catalogs that incorporate these observational effects, created from thousands of state-of-the-art hydrodynamic simulations run with different codes from the CAMELS project. We find that, although these effects degrade the precision and accuracy of the models and increase the fraction of catalogs where the model breaks down, the fraction of galaxy catalogs where the model performs well is over 90%, demonstrating the potential of these models to constrain cosmological parameters even when applied to real data.
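
To make the statement that redshift intertwines radial positions and velocities concrete, the sketch below applies the standard redshift-space mapping $s = r + v_r/(aH)$ to a toy catalog and adds a velocity error. It is only an illustration under stated assumptions: the function names, the Gaussian error model, the value of `sigma_v`, and the toy numbers are hypothetical and are not taken from the paper's pipeline.

```python
import numpy as np

def redshift_space_distance(r, v_r, hubble=100.0, a=1.0):
    """Radial distance inferred from redshift alone.

    r      : true comoving radial distance [Mpc/h]
    v_r    : radial peculiar velocity [km/s]
    hubble : Hubble rate in km/s per Mpc/h (100 at z = 0 in these units)
    a      : scale factor
    """
    return r + v_r / (a * hubble)

def perturb_velocities(v_r, sigma_v, rng):
    """Add zero-mean Gaussian noise of width sigma_v [km/s].

    The Gaussian form is an assumption made for this sketch.
    """
    return v_r + rng.normal(0.0, sigma_v, size=v_r.shape)

# Toy catalog: three galaxies along one line of sight (made-up values).
rng = np.random.default_rng(0)
r = np.array([10.0, 20.0, 30.0])       # Mpc/h
v_r = np.array([200.0, -300.0, 0.0])   # km/s

s = redshift_space_distance(r, v_r)    # positions shifted by v_r / (a H)
v_obs = perturb_velocities(v_r, sigma_v=150.0, rng=rng)
```

In this toy case a galaxy at 10 Mpc/h receding at 200 km/s appears at 12 Mpc/h, so its position and velocity cannot be separated from the redshift alone.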
