From Redshift to Real Space: Combining Linear Theory With Neural Networks
Edoardo Maragliano, Punyakoti Ganeshaiah Veena, Giulia Degni, Enzo Franco Branchini
TL;DR
This paper tackles redshift-space distortions in large-scale structure analyses by proposing a hybrid LT+NN reconstruction that merges physics-based linear theory with a neural network to map redshift-space halo fields to real space. The LT component corrects large-scale distortions, while the NN learns quasi-linear and small-scale corrections. Trained on 100 Quijote halo catalogs at $z=1$, the hybrid yields approximately $50\%$ lower MSE than LT alone and $\approx 12\%$ lower than NN alone, with a cross-correlation with the true real-space field near unity ($r \approx 0.98$). It also improves two-point statistics, including BAO-scale features, and void measurements, while requiring only modest training data and compute, which demonstrates a synergistic benefit from combining analytical models with machine learning. The results indicate that the LT+NN hybrid can robustly reconstruct real-space fields from redshift-space data and holds promise for upcoming wide-field galaxy surveys, subject to validation on more realistic datasets and survey conditions.
Abstract
Spectroscopic redshift surveys are key tools to trace the large-scale structure (LSS) of the Universe and test the $\Lambda$CDM model. However, using redshifts as distance proxies introduces distortions in the 3D galaxy distribution. If uncorrected, these distortions lead to systematic errors in LSS analyses and cosmological parameter estimation. We present a new method that combines linear theory (LT) and a neural network (NN) to mitigate redshift-space distortions (RSDs). The hybrid LT+NN approach is trained and validated on dark matter halo fields from $z = 1$ snapshots of the Quijote N-body simulations. LT corrects large-scale distortions in the linear regime, while the NN learns quasi-linear and small-scale features. The LT correction is applied first, then the NN is trained on the resulting fields to improve accuracy across scales. The method uses a mean squared error (MSE) loss and yields significant performance gains: an improvement of approximately 50% over LT alone and 12% over NN alone. The fields reconstructed with the LT+NN method correlate more strongly with the true real-space fields than those from either LT or NN alone. The hybrid method also improves clustering statistics such as halo-halo and halo-void correlations, with benefits extending to BAO scales. Compared to NN-only, it better suppresses spurious anisotropies on large and quasi-linear scales, as measured by the quadrupole moments of the correlation functions. This work shows that combining a physically motivated dynamical model with a machine learning algorithm leverages the strengths of both approaches. The LT+NN method achieves high accuracy with modest training data and computational cost, making it a promising tool for future applications to more realistic galaxy surveys.
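As a rough illustration of the LT step and of the evaluation metrics mentioned above, the following Python sketch inverts the plane-parallel Kaiser relation $\delta_s(\mathbf{k}) = (1 + \beta\mu^2)\,\delta_r(\mathbf{k})$ on a toy Gaussian field. It is not the paper's pipeline: the grid size, the value of `beta`, the function names, and the use of a single global Pearson coefficient (rather than a scale-dependent one) are all assumptions made for illustration, and the NN stage is omitted.

```python
import numpy as np


def mu_squared(n):
    """Return mu^2 = (k_z / |k|)^2 on an n^3 FFT grid (line of sight along z)."""
    k = np.fft.fftfreq(n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0  # avoid 0/0 at k = 0; mu^2 is 0 there since k_z = 0
    return kz**2 / k2


def to_redshift_space(delta_real, beta):
    """Apply the linear Kaiser factor (1 + beta * mu^2) in Fourier space."""
    mu2 = mu_squared(delta_real.shape[0])
    return np.fft.ifftn((1.0 + beta * mu2) * np.fft.fftn(delta_real)).real


def lt_reconstruct(delta_red, beta):
    """Undo the Kaiser factor: a toy stand-in for the LT correction."""
    mu2 = mu_squared(delta_red.shape[0])
    return np.fft.ifftn(np.fft.fftn(delta_red) / (1.0 + beta * mu2)).real


def mse(a, b):
    """Mean squared error between two fields."""
    return float(np.mean((a - b) ** 2))


def cross_corr(a, b):
    """Global Pearson cross-correlation coefficient between two fields."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    return float(np.sum(a0 * b0) / np.sqrt(np.sum(a0**2) * np.sum(b0**2)))


# Toy data: a Gaussian random field on a 16^3 grid (assumed, not Quijote data).
rng = np.random.default_rng(0)
delta_real = rng.standard_normal((16, 16, 16))
beta = 0.5  # assumed distortion parameter, chosen only for illustration

delta_red = to_redshift_space(delta_real, beta)
delta_lt = lt_reconstruct(delta_red, beta)

print("MSE, redshift vs real:", mse(delta_red, delta_real))
print("MSE, LT vs real:      ", mse(delta_lt, delta_real))
print("r,   LT vs real:      ", cross_corr(delta_lt, delta_real))
```

In the hybrid scheme described above, a network would then be trained with an MSE loss on pairs of LT-corrected and true real-space fields. In this toy setup the LT inversion is exact by construction, so no residual is left for an NN to learn; on halo fields from N-body simulations, the nonlinear residual is what the NN stage targets.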
