Reconstruction of Dark Matter and Baryon Density From Galaxies: A Comparison of Linear, Halo Model and Machine Learning-Based Methods
Jordan Krywonos, Yurii Kvasiuk, Matthew C. Johnson, Moritz Münchmeyer
TL;DR
This study benchmarks methods for reconstructing unobserved dark matter and baryon density fields from galaxy data, using the CAMELS simulations to compare linear transfer, halo-model painting, halo ML, and field-level ML. The authors show that a hybrid GNN-CNN model delivers the most accurate field-level reconstructions across mildly non-linear scales, including improved small-scale fidelity for both dark matter and gas. Halo painting improves halo-field reconstruction but does not outperform full-field ML, underscoring the value of end-to-end field-level learning for capturing filamentary structure. The results have practical implications for cross-correlation analyses (e.g., kSZ and lensing) and point to a pathway for robust parameter inference with marginalized baryonic uncertainties using field-level templates.
Abstract
For many analyses in cosmology it is necessary to reconstruct the likely distribution of unobserved fields, such as dark matter or non-luminous baryons, from observed luminous tracers. The dominant approach in cosmology has been to use the so-called halo model, which assumes radially symmetric profiles centered around luminous tracers such as galaxies. More recently, field-level machine learning methods have been proposed that can learn to estimate the unobserved field after being trained on simulations. However, it is unclear whether machine learning methods indeed significantly improve over linear methods or the halo model. In this paper we make a systematic comparison of different approaches to reconstructing dark matter and non-luminous baryons from galaxy data using the CAMELS simulations. These simulations are in a $25\ \mathrm{Mpc}/h$ box, allowing us to compare performance from the mildly non-linear scales $(k\sim 0.4\ h/\mathrm{Mpc})$ down to the size of individual halos. We find the best results using a combined GNN-CNN approach. We also provide a general analysis and visualization of the relationship of matter, non-luminous baryons, halos, and galaxies in these simulations to interpret our results.
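
To make the halo-model idea of "radially symmetric profiles centered around luminous tracers" concrete, the following is a minimal sketch of painting such profiles onto a periodic grid. It is an illustration under stated assumptions, not the pipeline used in the paper: the isotropic Gaussian profile, the function name `paint_profiles`, and the parameters `n_grid` and `sigma` are all hypothetical choices made here, and only the 25 Mpc/h box size is taken from the abstract.

```python
import numpy as np


def paint_profiles(positions, masses, box_size=25.0, n_grid=32, sigma=0.5):
    """Paint an isotropic Gaussian profile around each tracer on a periodic grid.

    positions : sequence of (x, y, z) tracer positions, same length units as box_size
    masses    : total weight each tracer contributes to the painted field
    sigma     : assumed profile width (illustrative; a real halo model would use
                a fitted, mass-dependent profile instead of a fixed Gaussian)
    """
    # Cell-centre coordinates along one axis.
    centres = (np.arange(n_grid) + 0.5) * box_size / n_grid
    field = np.zeros((n_grid, n_grid, n_grid))

    for pos, mass in zip(positions, masses):
        # Per-axis separation between every cell centre and the tracer,
        # wrapped with the minimum-image convention for a periodic box.
        d = centres[None, :] - np.asarray(pos, dtype=float)[:, None]  # (3, n_grid)
        d -= box_size * np.round(d / box_size)

        # A Gaussian in radius factorises into one factor per axis.
        g = np.exp(-0.5 * (d / sigma) ** 2)
        profile = g[0][:, None, None] * g[1][None, :, None] * g[2][None, None, :]

        # Normalise so each tracer adds exactly `mass` to the grid total.
        field += mass * profile / profile.sum()

    return field


if __name__ == "__main__":
    galaxies = [(8.203125, 8.203125, 8.203125), (1.0, 24.0, 5.0)]
    weights = [1.0, 3.0]
    density = paint_profiles(galaxies, weights)
    print(density.shape, density.sum())
```

Because every painted profile is spherically symmetric by construction, a field built this way cannot represent anisotropic structure such as filaments between tracers, which is the limitation that field-level methods are meant to address.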
