Primordial non-Gaussianity -- Fast simulations and persistent summary statistics
Juan Calles, Gabriella Contardo, Jorge Noreña, Jacky H. T. Yip, Gary Shiu
TL;DR
This work probes how topological descriptors from persistent homology and traditional clustering statistics constrain primordial non-Gaussianity (PNG) in large-scale structure, using two simulation suites (PNG-pmwd and QuijotePNG) and likelihood-free neural regression. The authors introduce PNG-pmwd with 22,410 halo catalogs across local and equilateral PNG shapes and varied cosmology, enabling a broad comparison of statistics across halo-mass bins. They find that PD-statistics, a simple topological descriptor, typically yields the strongest constraints for both $f_{\rm NL}^{\rm loc}$ and $f_{\rm NL}^{\rm equil}$, with large halos carrying most of the information; including small halos or small scales can degrade performance and hinder transferability between simulators. Transferability tests reveal that models trained on fast simulations can generalize to full simulations only when small-scale modes and low-mass halos are omitted, highlighting the need for careful handling of resolution differences and standardization when applying learned mappings to different datasets.
Abstract
We investigate the sensitivity of topological and traditional summary statistics to primordial non-Gaussianity (PNG) using two suites of simulations. First, we introduce a new simulation suite for PNG, PNG-pmwd, comprising more than $20{,}000$ halo catalogs that individually vary the local and equilateral shapes, together with variations in $\Omega_m$ and $\sigma_8$. Second, we carry out a systematic comparison of topological descriptors, as well as power spectrum and bispectrum measurements, evaluating their constraining power on both local and equilateral $f_{\rm NL}$ and how this sensitivity varies with halo mass. The PNG-pmwd dataset enables likelihood-free neural regression of $f_{\rm NL}$ across multiple halo mass bins for a wide range of summary statistics. Third, we assess the transferability of these learned mappings by testing whether models trained on fast pmwd simulations can robustly infer $f_{\rm NL}$ from simulations in the QuijotePNG suite. We find that a combination of simple descriptive statistics of the topological features (PD-statistics) performs best at constraining equilateral PNG. We observe that the constraining power of these summaries comes from high-mass halos, with low-mass halos adding noise and degrading performance. Similarly, we find that the transferability of the learned mappings, for both the topological statistics and the power spectrum plus bispectrum, degrades if small scales or low-mass halos are included.
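To make the notion of "simple descriptive statistics of the topological features" concrete, the following is a minimal sketch of how a persistence diagram could be reduced to a fixed-length feature vector. It is an illustration only: the function name `pd_statistics`, the choice of count, mean, standard deviation, and quartiles, and the toy diagram are all assumptions, and the abstract does not specify which statistics the authors actually use.

```python
import numpy as np


def pd_statistics(diagram, quantiles=(0.25, 0.5, 0.75)):
    """Summarize one persistence diagram with simple descriptive statistics.

    diagram: array-like of shape (n, 2) holding (birth, death) pairs.
    Returns a 1D feature vector: the number of features, followed by the
    mean, standard deviation, and quantiles of births, deaths, and
    persistences (death - birth). The set of statistics is hypothetical.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    n_features = 1 + 3 * (2 + len(quantiles))
    if diagram.shape[0] == 0:
        # An empty diagram maps to an all-zero vector of the same length.
        return np.zeros(n_features)

    birth, death = diagram[:, 0], diagram[:, 1]
    persistence = death - birth

    features = [float(diagram.shape[0])]
    for values in (birth, death, persistence):
        features.append(values.mean())
        features.append(values.std())
        features.extend(np.quantile(values, quantiles))
    return np.array(features)


# Toy diagram with three (birth, death) pairs, for illustration only.
toy_diagram = np.array([[0.0, 1.0], [0.5, 2.5], [1.0, 2.0]])
vec = pd_statistics(toy_diagram)
```

A vector of this kind, computed per homology dimension and per halo-mass bin, could then serve as the input to a regression network that predicts $f_{\rm NL}$; how the authors assemble and normalize their inputs is not stated in the abstract.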
