CosmoGLINT: Cosmological Generative Model for Line Intensity Mapping with Transformer
Kana Moriwaki, Rui Lan Jun, Ken Osato, Naoki Yoshida
TL;DR
CosmoGLINT introduces a Transformer-based autoregressive generator, trained on hydrodynamic simulations, that populates DM-only haloes with galaxies. Conditioned on halo mass $M$, it generates galaxy properties such as SFR, distance to the halo centre, and radial and tangential velocities. It reproduces key LIM statistics, including voxel SFR distributions and real- and redshift-space power spectra, and can produce multiple mock realizations and lightcones by applying the learned distributions to DM-only halo catalogues. The approach is demonstrated on IllustrisTNG data and extended to large-volume DM-only runs (e.g., Pinocchio), with halo-mass rescaling to mimic baryonic effects, enabling realistic, scalable LIM mocks for current and future surveys. Limitations related to subgrid physics and extrapolation to very massive haloes are discussed, along with potential extensions to environment, concentration, metallicity, and multi-line emission that would enhance realism and support cross-survey analyses.
Abstract
Modelling star-forming galaxies is crucial for upcoming observations of large-scale matter and galaxy distributions with galaxy redshift surveys and line intensity mapping (LIM). We introduce CosmoGLINT (Cosmological Generative model for Line INtensity mapping with Transformer), a Transformer-based generative framework designed to create realistic galaxy populations from dark matter (DM)-only simulations. CosmoGLINT auto-regressively generates sequences of galaxy properties -- including star formation rate (SFR), distance to the halo centre, and radial and tangential velocities relative to the halo -- conditioned on halo mass. Trained on the IllustrisTNG hydrodynamic simulation, the model reproduces key statistical properties of the original data, including the voxel intensity distribution and the power spectrum in both real and redshift space. It can efficiently generate multiple realisations of a given galaxy population, enabling the creation of mock LIM and redshift survey catalogues from large halo catalogues produced by fast DM-only simulations. We show that our model, trained at multiple redshifts, can be applied to DM halo lightcone data to generate a realistic mock galaxy lightcone that incorporates the redshift evolution of the galaxy population. The mock catalogues can be readily used to derive statistical quantities and to develop data analysis pipelines for ongoing and future wide-field surveys.
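The auto-regressive generation described above can be illustrated with a minimal sketch. It is not the CosmoGLINT implementation: the function `toy_next_galaxy` is a hypothetical stand-in for the trained Transformer, and all of its distributions and numerical values (stopping probability, scatter, velocity scale, units) are assumptions chosen only to make the loop runnable. What the sketch shows is the control flow: each galaxy's properties are drawn conditioned on the halo mass and on the galaxies already generated, until the sequence ends.

```python
import numpy as np

def toy_next_galaxy(log_mass, previous, rng):
    """Hypothetical stand-in for the trained network (NOT the real model).

    Returns (log_sfr, distance, v_rad, v_tan) for the next galaxy,
    or None to signal the end of the sequence.
    """
    # Assumed stopping rule: more massive haloes host longer sequences.
    if rng.random() < 1.0 / (1.0 + 10.0 ** (log_mass - 11.0)):
        return None
    if previous:
        # Condition on the previous galaxy: SFR never increases along the sequence.
        log_sfr = previous[-1][0] - rng.exponential(0.3)
        distance = rng.uniform(0.0, 1.0)  # assumed units: fraction of halo radius
    else:
        # First galaxy: SFR set by halo mass, placed at the halo centre.
        log_sfr = 0.8 * (log_mass - 12.0) + rng.normal(0.0, 0.3)
        distance = 0.0
    v_rad = rng.normal(0.0, 100.0)       # assumed km/s, signed
    v_tan = abs(rng.normal(0.0, 100.0))  # assumed km/s, magnitude only
    return (log_sfr, distance, v_rad, v_tan)

def populate_halo(log_mass, rng, max_galaxies=50):
    """Generate one galaxy sequence for a single halo, one galaxy at a time."""
    galaxies = []
    while len(galaxies) < max_galaxies:
        nxt = toy_next_galaxy(log_mass, galaxies, rng)
        if nxt is None:
            break
        galaxies.append(nxt)
    # Rows are galaxies; columns are (log_sfr, distance, v_rad, v_tan).
    return np.array(galaxies, dtype=float).reshape(-1, 4)

# Two independent realisations for the same halo mass.
rng = np.random.default_rng(42)
for _ in range(2):
    print(populate_halo(13.0, rng).shape)
```

Calling `populate_halo` repeatedly with the same halo mass yields different galaxy sequences, which is the sense in which a generative model of this kind can provide multiple mock realisations from a single DM-only halo catalogue.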
