Reconstruction of Dark Matter and Baryon Density From Galaxies: A Comparison of Linear, Halo Model and Machine Learning-Based Methods

Jordan Krywonos, Yurii Kvasiuk, Matthew C. Johnson, Moritz Münchmeyer

TL;DR

This study benchmarks methods for reconstructing unobserved dark matter and baryon density fields from galaxy data, using the CAMELS simulations to compare four approaches: linear transfer, halo-model painting, halo ML, and field-level ML. The authors find that a hybrid GNN-CNN model delivers the most accurate field-level reconstructions across mildly non-linear scales, including improved small-scale fidelity for both dark matter and gas. Halo painting improves halo-field reconstruction but does not outperform full-field ML, which underscores the value of end-to-end field-level learning for capturing filamentary structure. The results have practical implications for cross-correlation analyses (e.g., kSZ and lensing) and point to a pathway for robust parameter inference with marginalized baryonic uncertainties using field-level templates.

Abstract

For many analyses in cosmology it is necessary to reconstruct the likely distribution of unobserved fields, such as dark matter or non-luminous baryons, from observed luminous tracers. The dominant approach in cosmology has been to use the so-called halo model, which assumes radially symmetric profiles centered around luminous tracers such as galaxies. More recently, field-level machine learning methods have been proposed that can learn to estimate the unobserved field after being trained on simulations. However, it is unclear whether machine learning methods indeed significantly improve over linear methods or the halo model. In this paper we make a systematic comparison of different approaches to reconstructing dark matter and non-luminous baryons from galaxy data, using the CAMELS simulations. These simulations are in a $25\ \texttt{Mpc/h}$ box, allowing us to compare performance from the mildly non-linear scales $(k\sim 0.4\ \mathrm{h/Mpc})$ down to the size of individual halos. We find the best results using a combined GNN-CNN approach. We also provide a general analysis and visualization of the relationship of matter, non-luminous baryons, halos, and galaxies in these simulations to interpret our results.
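
To put the quoted scales in context, the short sketch below computes the largest and smallest wavenumbers accessible in a periodic box of side $25\ \texttt{Mpc/h}$. It is only an illustrative calculation: the function name `box_wavenumbers` and the grid size of 256 cells per side are assumptions made here, not values taken from the paper.

```python
import math

def box_wavenumbers(box_size, n_grid):
    """Return (fundamental, Nyquist) wavenumbers for a periodic cubic box.

    box_size is in Mpc/h, so the results are in h/Mpc.
    n_grid is the number of grid cells per side (a hypothetical choice here).
    """
    k_fund = 2.0 * math.pi / box_size      # longest wavelength that fits in the box
    k_nyq = math.pi * n_grid / box_size    # shortest wavelength resolved by the grid
    return k_fund, k_nyq

# Box size from the abstract; the grid resolution is an assumed example value.
k_fund, k_nyq = box_wavenumbers(box_size=25.0, n_grid=256)
print(f"k_fund = {k_fund:.3f} h/Mpc, k_nyq = {k_nyq:.1f} h/Mpc")
```

The fundamental mode comes out at roughly $0.25\ \mathrm{h/Mpc}$, so the quoted $k\sim 0.4\ \mathrm{h/Mpc}$ lies close to the largest scales such a box can probe.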

Paper Structure

This paper contains 38 sections, 39 equations, 18 figures.

Figures (18)

  • Figure 1: The probability of a halo hosting a central galaxy times the average number density distribution of halo masses (left) and $r_{200}$ radii (right).
  • Figure 2: The auto-power spectra of the galaxy (blue) and dark matter (orange) fields. In green is their cross-correlation.
  • Figure 3: The distribution of galaxies (red points), dark matter halos (blue spheres), dark matter (blue) particles, and gas (green) particles in the simulation volume of $25^3\ (\texttt{Mpc/h})^3$. The radii of the spheres are proportional to $r_{200}$ of the halo.
  • Figure 4: The distribution of galaxies (red dots) that belong to the largest halo in a simulation box. The halo is depicted as a blue sphere. The bottom-left and bottom-right plots show the distribution of the dark matter (blue) and gas (green) particles, respectively, in the same zoomed-in region.
  • Figure 5: The top row shows methods that reproduce the dark matter field and the bottom row shows either the halo or galaxy field. Top row: truth (left), GNN-CNN (middle), and linear filter (right) dark matter density fields. Bottom row: true halo (left), NFW (middle), and mass-weighted (right) density fields. The fields are 1+overdensity, plotted in a $5\times25\times25\ (\texttt{Mpc/h})^3$ volume, averaged over the $x$ axis. The colorbar is logarithmic with values clipped to $10^{-4}$.
  • ...and 13 more figures
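
The auto- and cross-power spectra mentioned in the Figure 2 caption are commonly summarized by the cross-correlation coefficient $r(k) = P_{ab}(k)/\sqrt{P_{aa}(k)\,P_{bb}(k)}$. The following is a minimal NumPy sketch of how such a quantity could be estimated from two gridded overdensity fields. It is not the authors' pipeline: the function names, the binning in multiples of the fundamental mode, and the normalization convention are all assumptions chosen for illustration.

```python
import numpy as np

def binned_spectra(field_a, field_b, box_size):
    """Spherically averaged auto- and cross-power spectra of two cubic grids.

    field_a, field_b: 3D overdensity arrays of identical cubic shape.
    box_size: side length of the periodic box (e.g. in Mpc/h).
    Returns bin centres k and the binned P_aa, P_bb, P_ab.
    """
    n = field_a.shape[0]
    volume = box_size ** 3
    cell = (box_size / n) ** 3
    # Approximate the continuum Fourier transform: delta(k) ~ cell volume * FFT
    fa = np.fft.fftn(field_a) * cell
    fb = np.fft.fftn(field_b) * cell

    # Magnitude of the wavevector at every grid point
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kmag = np.sqrt(k1d[:, None, None] ** 2
                   + k1d[None, :, None] ** 2
                   + k1d[None, None, :] ** 2)

    # Bins of width k_fund centred on integer multiples of k_fund (assumed scheme)
    k_fund = 2.0 * np.pi / box_size
    n_bins = n // 2
    edges = k_fund * (np.arange(n_bins + 1) + 0.5)
    idx = np.digitize(kmag.ravel(), edges) - 1
    valid = (idx >= 0) & (idx < n_bins)   # drops k = 0 and modes beyond the last bin
    counts = np.bincount(idx[valid], minlength=n_bins)

    def bin_mean(values):
        sums = np.bincount(idx[valid], weights=values.ravel()[valid],
                           minlength=n_bins)
        return sums / np.maximum(counts, 1)

    p_aa = bin_mean(np.abs(fa) ** 2) / volume
    p_bb = bin_mean(np.abs(fb) ** 2) / volume
    p_ab = bin_mean((fa * np.conj(fb)).real) / volume
    k_centres = 0.5 * (edges[1:] + edges[:-1])
    return k_centres, p_aa, p_bb, p_ab

def cross_correlation_coefficient(field_a, field_b, box_size):
    """Return k and r(k) = P_ab / sqrt(P_aa * P_bb) for two gridded fields."""
    k, p_aa, p_bb, p_ab = binned_spectra(field_a, field_b, box_size)
    denom = np.sqrt(p_aa * p_bb)
    r = np.divide(p_ab, denom, out=np.zeros_like(p_ab), where=denom > 0)
    return k, r
```

With this definition, a field compared against a rescaled copy of itself gives $r(k)=1$ in every bin, while two unrelated fields give values scattered around zero, so $r(k)$ is a convenient scale-by-scale measure of how well a reconstructed field tracks the truth.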