LIMFAST. IV. Learning High-Redshift Galaxy Formation from Multiline Intensity Mapping with Implicit Likelihood Inference

Guochao Sun, Tri Nguyen, Claude-André Faucher-Giguère, Adam Lidz, Tjitske Starkenburg, Bryan R. Scott, Tzu-Ching Chang, Steven R. Furlanetto

TL;DR

This paper develops an implicit likelihood inference framework to constrain high-redshift galaxy formation using multi-tracer line intensity mapping of [$\mathrm{CII}$] and [$\mathrm{OIII}$]. It combines LIMFAST semi-numerical simulations with neural density estimation via normalizing flows to learn the mapping from auto- and cross-power spectra to physical parameters governing the star formation efficiency and the $\dot{\Sigma}_{\star}$-$\Sigma_{g}$ relation. The results show that jointly modeling both lines, including cross-correlations with Roman LBGs, breaks degeneracies and yields tight, unbiased constraints on $\xi$ and $\zeta$, with validated posteriors through robust calibration tests. This framework demonstrates the potential of multi-tracer LIM for revealing the physics of early galaxy formation and motivates further development to include systematics, field-level analyses, and joint cosmological inference.

Abstract

By opening up new avenues to statistically constrain astrophysics and cosmology with large-scale structure observations, the line intensity mapping (LIM) technique calls for novel tools for efficient forward modeling and inference. Implicit likelihood inference (ILI) from semi-numerical simulations provides a powerful setup for investigating a large model parameter space in a data-driven manner and has therefore gained significant recent attention. Using simulations of high-redshift 158 $\mu$m [CII] and 88 $\mu$m [OIII] LIM signals created by the LIMFAST code, we develop an ILI framework in a case study of learning the physics of early galaxy formation from the auto-power spectra of these lines or their cross-correlation with galaxy surveys. We leverage neural density estimation with normalizing flows to learn the mapping between the simulated power spectra and parameters that characterize the physics governing the star formation efficiency and the $\dot{\Sigma}_{\star}$-$\Sigma_\mathrm{g}$ relation of high-redshift galaxies. Our results show that their partially degenerate effects can be unambiguously constrained when combining [CII] with [OIII] measurements to be made by new-generation mm/sub-mm LIM experiments.

Paper Structure

This paper contains 17 sections, 21 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: The predicted gas mass--halo mass relation (left), stellar mass--halo mass relation (middle), and mass--metallicity relation (MZR, right) at $z=6$ under varying assumptions of the mass loading factor ($\xi$ and $\xi_z$) and $\dot{\Sigma}_{\star}$--$\Sigma_\mathrm{g}$ relation ($\zeta$) parameters. Note that the stellar mass content is set by the feedback strength, e.g., momentum-driven ("M") or energy-driven ("E"), and largely independent of the $\dot{\Sigma}_{\star}$--$\Sigma_\mathrm{g}$ relation ("KS" or "FQH13"), which primarily affects the gas mass content. The insensitivity of the MZR to the $\dot{\Sigma}_{\star}$--$\Sigma_\mathrm{g}$ relation is then a necessary consequence of the metal mass production at equilibrium. As sanity checks, we plot observational constraints on the star formation efficiency at $z\sim6$ from HST (Stefanon2021) and JWST (Shuntov2025), along with the high-$z$ MZR predicted by the FIRE simulations (Marszewski2024), which is in close agreement with the latest JWST observations (Nakajima2023; Chemerynska2024; Curti2024).
  • Figure 2: Scaling relations of [$\mathrm{C\,\textsc{ii}}$] and [$\mathrm{O\,\textsc{iii}}$] luminosities as a function of the halo mass (left) or the SFR (right) at $z = 6$ for the same model variations as in figure 1. Note how the [$\mathrm{C\,\textsc{ii}}$] luminosity depends on both the feedback mode and the $\dot{\Sigma}_{\star}$--$\Sigma_\mathrm{g}$ relation as a result of their effects on the gas mass, whereas the [$\mathrm{O\,\textsc{iii}}$] luminosity is almost independent of the latter due to the lack of sensitivity of the SFR to $\zeta$. For comparison, the predictions of some other models in the literature are shown by the gray dashed curves, which provide good fits to the latest ALMA observations (Lagache2018; Harikane2020).
  • Figure 3: Top: shot-noise-included power spectra of [$\mathrm{C\,\textsc{ii}}$] and [$\mathrm{O\,\textsc{iii}}$] (left), their cross-correlations with the Roman LBGs (middle), and the LBGs themselves (right) at $z = 7$ for an example model with $\xi \approx 1/3$, $\xi_z \approx 0$, and $\zeta \approx 1.4$. The clustering and shot-noise components of the [$\mathrm{C\,\textsc{ii}}$] and [$\mathrm{O\,\textsc{iii}}$] power spectra are indicated by the dashed and dotted curves, respectively. Bottom: mock [$\mathrm{C\,\textsc{ii}}$] and [$\mathrm{O\,\textsc{iii}}$] intensity maps at $z = 7$. Locations of LBGs detectable in a moderate-depth Roman survey ($m_\mathrm{AB, lim} < 28.2$) are marked by the green dots, which correlate with the overdensities traced by the line intensity maps.
  • Figure 4: A schematic visualization of the workflow for the training process in the ILI framework presented. A neural posterior estimation (NPE) model is trained to learn the joint posterior of the galaxy formation parameter vector $\boldsymbol{\theta} = (\xi, \xi_z, \zeta)$ from [$\mathrm{C\,\textsc{ii}}$] and [$\mathrm{O\,\textsc{iii}}$] auto-power spectra or their cross-power spectra with LBGs simulated by LIMFAST. The NPE model extracts informative features $(s)$ from power spectrum data using an embedding network (GRU; see the appendix for details) and uses them to condition a normalizing flow (NF) that learns the target posterior by applying a sequence of invertible transforms $(\mathcal{T})$ to a simple base distribution $(u)$. Uniform priors are assumed for the parameters of interest. During inference, power spectra are supplied to the NPE model and the posterior is sampled without re-training.
  • Figure 5: Comparison between the high-$z$ cosmic SFRDs sampled from our three-dimensional parameter space of interest (the 16--84th percentile) and the predictions from pre- and post-JWST empirical models (MD2014; Harikane2022; Donnan2023). The dark shaded and light hatched regions are integrated down to halos with virial temperature $T_\mathrm{vir}=10^4\,$K and absolute UV magnitude $M_\mathrm{UV} \simeq -17$, respectively, the latter of which corresponds to the detection limit assumed by the results from the literature. Note that, unlike the two post-JWST models that are constrained by measurements extending to $z>10$, MD2014 relies on extrapolation at $z>8$.
  • ...and 7 more figures
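
The normalizing-flow step described in the Figure 4 caption can be illustrated with a toy sketch. This is not the LIMFAST or NPE implementation from the paper: it assumes a one-dimensional parameter instead of the three-dimensional $\boldsymbol{\theta}$, replaces the GRU embedding with a hand-supplied feature list `s`, and uses fixed, hypothetical weights where a trained network would supply them. It shows only how a sequence of invertible transforms applied to a standard normal base variable $u$ gives both a log-density (by the change-of-variables rule) and posterior samples.

```python
import math
import random


class ConditionalAffine:
    """One invertible transform T: x_out = shift(s) + exp(log_scale(s)) * x_in."""

    def __init__(self, w_shift, w_log_scale):
        # Hypothetical fixed weights; a trained network would learn these.
        self.w_shift = w_shift
        self.w_log_scale = w_log_scale

    def _params(self, s):
        # Shift and log-scale are linear functions of the conditioning features s.
        shift = sum(w * x for w, x in zip(self.w_shift, s))
        log_scale = sum(w * x for w, x in zip(self.w_log_scale, s))
        return shift, log_scale

    def forward(self, x, s):
        shift, log_scale = self._params(s)
        return shift + math.exp(log_scale) * x

    def inverse(self, y, s):
        shift, log_scale = self._params(s)
        x = (y - shift) * math.exp(-log_scale)
        return x, -log_scale  # second value is log |dx/dy|


def log_posterior(theta, s, layers):
    """log q(theta | s) via the change-of-variables rule."""
    x, log_det_total = theta, 0.0
    for layer in reversed(layers):  # undo the transforms, last one first
        x, log_det = layer.inverse(x, s)
        log_det_total += log_det
    log_base = -0.5 * (x * x + math.log(2.0 * math.pi))  # standard normal base
    return log_base + log_det_total


def sample_posterior(s, layers, n, rng):
    """Draw n samples by pushing base draws u ~ N(0, 1) through the transforms."""
    samples = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        for layer in layers:
            x = layer.forward(x, s)
        samples.append(x)
    return samples


# Toy usage: two stacked transforms conditioned on a two-element feature list.
layers = [
    ConditionalAffine([0.5, 0.0], [0.0, 0.1]),
    ConditionalAffine([0.0, 1.0], [0.2, 0.0]),
]
s = [1.0, 2.0]
draws = sample_posterior(s, layers, 1000, random.Random(0))
```

Because both toy transforms are affine, their composition is still Gaussian. A practical flow would use more expressive transforms and would fit its weights by maximizing `log_posterior` over simulated parameter and power spectrum pairs.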