A modern halo streaming model for redshift space distortions

Cheng-Zong Ruan; Baojiu Li; Carlton M. Baugh; Sownak Bose; Alexander Eggemeier; David F. Mota

Abstract

Accurate modelling of redshift-space distortions (RSD) in galaxy clustering is essential for extracting cosmological information from current and forthcoming large-scale structure surveys. While perturbation theory is reliable on large scales, much of the constraining power lies at intermediate and small separations, where nonlinear dynamics within and between dark matter haloes dominate. We present a halo streaming model for nonlinear galaxy clustering in redshift space that is accurate and physically interpretable. Our framework combines the streaming model for RSD with a halo-model decomposition of the galaxy clustering into central/satellite and one-/two-halo contributions. We build dedicated emulators for the key physical ingredients, trained on a suite of $N$-body simulations: halo mass functions, real-space halo two-point correlation functions, and pairwise velocity moments. By emulating these modular building blocks rather than the final redshift-space observable, this approach preserves physical transparency, enables targeted optimisation for each ingredient, and remains flexible to changes in tracer populations and galaxy-halo connection models. The resulting halo streaming model reproduces the simulated nonlinear anisotropic clustering signal down to highly nonlinear scales, while achieving the computational efficiency required for cosmological parameter inference. This framework is designed to support full-shape RSD analyses for surveys such as DESI and \textit{Euclid}, facilitating precision measurements of structure growth and tests of gravity. All codes and trained emulators are publicly available in the \href{https://github.com/chzruan/freyja}{\texttt{freyja}} repository.
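For orientation, the streaming model referred to in the abstract is usually written in the following standard form. The notation here is an assumption taken from the general RSD literature rather than from this paper: $\xi^{r}$ is the real-space correlation function, $\mathcal{P}$ is the line-of-sight pairwise velocity distribution, and velocities are expressed in length units (divided by $aH$). The paper's own symbols and conventions may differ.

```latex
\begin{equation}
1 + \xi^{s}(s_{\perp}, s_{\parallel})
  = \int_{-\infty}^{\infty} \mathrm{d}r_{\parallel}\,
    \left[ 1 + \xi^{r}(r) \right]
    \mathcal{P}\!\left( v_{\parallel} = s_{\parallel} - r_{\parallel} \,\middle|\, \boldsymbol{r} \right),
\qquad
r = \sqrt{s_{\perp}^{2} + r_{\parallel}^{2}} .
\end{equation}
```

In the framework described above, the real-space correlation functions and the pairwise velocity moments are the emulated ingredients, and the halo-model decomposition determines which pair types (central/satellite, one-/two-halo) each such integral is evaluated for.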

Paper Structure (23 sections, 45 equations, 10 figures, 1 table)

Introduction
Simulations and mock catalogues
The DEGRACE-pilot simulation suite
HOD prescriptions
HOD Variants and Extensions
The halo streaming model of redshift-space clustering
The streaming model for RSD
One-halo terms
Two-halo terms
The skew-T parameterisation of the pairwise velocity distribution
Emulators for the model ingredients
Halo mass functions
Training Data and Input Space
Gaussian Process Implementation
The high-mass tail of halo mass functions
...and 8 more sections

Figures (10)

Figure 1: Visualisation of the cosmological parameter space, $\mathcal{C}$, covered by the DEGRACE-pilot simulation suite. The suite includes $64$ cosmological models, of which models $1$-$54$ are used for training and $55$-$64$ are reserved for testing. https://github.com/chzruan/freyja/blob/main/paper_figs/cosmo_params.py
Figure 2: Validation of the cumulative halo mass function (cHMF) emulator against the test data set. Top panel: The cumulative halo mass functions for different cosmological models (colour-coded). Points show the ground truth from simulations, solid lines represent the emulator predictions, and dashed lines show the Schechter-fitting extrapolation at the high-mass end. Bottom panel: The relative percentage residual between the emulator prediction and the simulation data. https://github.com/chzruan/freyja/blob/main/paper_figs/hmf_gp.py
Figure 3: Validation of the matter power spectrum emulator. Top panel: comparison between emulator predictions (solid lines) and simulation measurements (points) for the cosmologies in the test data set. Colours encode different cosmological models. Bottom panel: relative difference between emulator and simulation results. The emulator reproduces the nonlinear matter power spectrum with an accuracy of approximately $1\%$ across the scale range used in the analysis. https://github.com/chzruan/freyja/blob/main/paper_figs/matter_pk_emulator.py
Figure 4: Linear halo bias as a function of halo mass at $z=0.25$ for the fiducial Planck cosmology (Planck Collaboration 2016). The purple stars represent measurements from a subset of five realisations ($N_{\text{box}}=5$), mimicking the noise level of the training data set. The red points show the ground truth derived from the full suite of $100$ realisations. The black curve follows the Tinker et al. (2010) bias model: the solid section marks the range used to fit the normalisation parameters, while the dashed section represents the extrapolation to higher masses. The vertical grey dashed line indicates the upper mass limit of the fitting region. The extrapolation agrees closely with the high-mass validation data. https://github.com/chzruan/freyja/blob/main/paper_figs/bias_tinker_extrapolation.py
Figure 5: Comparison between the simulated and emulated halo two-point correlation function for a representative cosmological model in the test data set. The upper panel shows the real-space correlation function $\xi_{\text{hh}}(r)$ scaled by $r^{2}$ for haloes with $M \ge 10^{12.5}\,h^{-1}M_{\odot}$. Red stars denote measurements from the $N$-body simulations, while the solid black curve shows the emulator prediction. The thin blue dashed curve corresponds to the large-scale linear bias approximation. The final prediction smoothly stitches the small-scale emulator output to the linear-theory limit to improve signal-to-noise on large scales. The lower panel shows the relative difference between emulator and simulations. Sub-percent agreement is achieved over the range where $\xi_{\text{hh}}(r)$ is well measured; apparent large deviations at the largest separations arise because $\xi_{\text{hh}}(r)$ approaches zero. https://github.com/chzruan/freyja/paper_figs/bias_tinker_extrapolation.py
...and 5 more figures
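The Figure 5 caption describes stitching the small-scale emulator output smoothly onto the linear-theory limit. One common way to do this is a smooth weight in $\log r$, sketched below. This is a minimal illustration, not the authors' implementation: the function name `stitch_to_linear`, the tanh weight, the transition scale `r_switch`, the `width`, and the power-law toy inputs are all assumptions made for the example.

```python
import numpy as np

def stitch_to_linear(r, xi_small, xi_linear, r_switch=40.0, width=0.15):
    """Blend a small-scale prediction into a large-scale limit.

    r_switch (in h^-1 Mpc) and width (in dex) are illustrative values,
    not numbers taken from the paper.
    """
    r = np.asarray(r, dtype=float)
    # Weight rises smoothly from 0 (small r) to 1 (large r) in log10(r).
    w = 0.5 * (1.0 + np.tanh((np.log10(r) - np.log10(r_switch)) / width))
    return (1.0 - w) * np.asarray(xi_small) + w * np.asarray(xi_linear)

# Toy stand-ins for the two inputs (hypothetical power laws, not real data).
r = np.logspace(-1, 2.2, 200)
xi_emulated = (r / 8.0) ** -1.8          # plays the role of the emulator output
xi_linear = 1.1 * (r / 8.0) ** -1.8      # plays the role of the linear-bias limit

xi_stitched = stitch_to_linear(r, xi_emulated, xi_linear)
```

With these settings the stitched curve follows the emulator stand-in at small separations and the linear stand-in at the largest ones, staying between the two inputs across the transition.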
