A Systematic Literature Review of Machine Learning Techniques for Observational Constraints in Cosmology
Luis Rojas, Sebastián Espinoza, Esteban González, Carlos Maldonado, Fei Luo
TL;DR
This systematic literature review assesses how machine learning techniques have been applied to derive observational cosmological constraints, focusing on Bayesian parameter estimation and the accuracy of posterior inferences. It analyzes common data probes (e.g., CMB, BAO, SNe Ia, OHD) and ML approaches (dominated by neural networks and deep learning, with Bayesian neural networks and Gaussian processes offering uncertainty advantages) and documents substantial speedups from ML surrogates and emulators. The review identifies that while ML methods frequently accelerate inference, their calibration and reliability can lag behind traditional MCMC approaches, highlighting a tension between efficiency and precision. It further notes a strong emphasis on addressing the $H_0$ tension, a need for broader cosmological model testing and dataset usage, and the importance of standardized benchmarking and physics-informed architectures to enable robust, reproducible inference for upcoming surveys such as LSST and Euclid.
Abstract
This paper presents a systematic literature review of machine learning techniques applied to deriving observational constraints in cosmology. The goal is to evaluate and synthesize existing research in order to identify effective methodologies, highlight gaps, and propose future research directions. Our review yields several key findings: (1) Various machine learning techniques, including Bayesian neural networks, Gaussian processes, and deep learning models, have been applied to cosmological data analysis, improving parameter estimation and the handling of large datasets. However, models that achieve significant computational speedups often yield worse confidence regions than traditional methods, so future research needs to improve efficiency and measurement precision together. (2) Traditional cosmological probes, such as Type Ia supernovae, baryon acoustic oscillations, and cosmic microwave background data, remain fundamental, but most studies focus narrowly on specific datasets. We recommend broader dataset usage to fully validate alternative cosmological models. (3) The reviewed studies mainly address the $H_0$ tension, leaving other cosmological challenges (such as the cosmological constant problem, warm dark matter, phantom dark energy, and others) unexplored. (4) Hybrid methodologies that combine machine learning with Markov chain Monte Carlo offer promising results, particularly when machine learning is used to solve differential equations, for example by replacing Einstein-Boltzmann solvers, ahead of the Markov chain Monte Carlo stage, which accelerates computations while maintaining precision. (5) There is a significant need for standardized evaluation criteria and methodologies, because variability in training processes and experimental setups complicates the comparability and reproducibility of results (abridged).
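As an illustration of the hybrid approach in finding (4), the following minimal sketch places a surrogate between a solver and a Markov chain Monte Carlo sampler. It is not taken from any reviewed study, and everything in it is an assumption made for illustration: the closed-form flat $\Lambda$CDM expansion rate $E(z) = \sqrt{\Omega_m (1+z)^3 + 1 - \Omega_m}$ stands in for a slow Einstein-Boltzmann solver, a lookup table with linear interpolation stands in for a trained neural-network emulator, the mock Hubble data are noise-free and synthetic, and all names (`GridEmulator`, `run_demo`, and so on) are hypothetical.

```python
import bisect
import math
import random


def expansion_rate(z, omega_m):
    """E(z) = H(z)/H0 for flat LambdaCDM; stands in for the slow solver."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


class GridEmulator:
    """Surrogate built from solver outputs precomputed on an omega_m grid."""

    def __init__(self, redshifts, om_min=0.05, om_max=0.95, n_train=181):
        step = (om_max - om_min) / (n_train - 1)
        self.nodes = [om_min + i * step for i in range(n_train)]
        # "Training": call the solver once per grid node.
        self.table = [[expansion_rate(z, om) for z in redshifts]
                      for om in self.nodes]

    def predict(self, omega_m):
        """Linearly interpolate E(z) between the two nearest grid nodes."""
        i = bisect.bisect_right(self.nodes, omega_m) - 1
        i = min(max(i, 0), len(self.nodes) - 2)
        w = (omega_m - self.nodes[i]) / (self.nodes[i + 1] - self.nodes[i])
        return [(1.0 - w) * a + w * b
                for a, b in zip(self.table[i], self.table[i + 1])]


def log_posterior(theta, emulator, h_obs, sigma):
    """Gaussian likelihood times flat priors on (H0, omega_m)."""
    h0, omega_m = theta
    if not (50.0 < h0 < 100.0 and 0.05 < omega_m < 0.95):
        return -math.inf
    model = [h0 * e for e in emulator.predict(omega_m)]
    return -0.5 * sum(((h - m) / s) ** 2
                      for h, m, s in zip(h_obs, model, sigma))


def metropolis(log_post, start, step_sizes, n_steps, rng):
    """Random-walk Metropolis sampler with a diagonal Gaussian proposal."""
    current, current_lp = list(start), log_post(start)
    chain = []
    for _ in range(n_steps):
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(current, step_sizes)]
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(list(current))
    return chain


def run_demo(seed=0, n_steps=20000, burn_in=2000):
    """Sample (H0, omega_m) using the emulator; return posterior means."""
    redshifts = [0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5, 1.75, 2.0]
    # Noise-free synthetic Hubble data generated at H0 = 70, omega_m = 0.3.
    h_obs = [70.0 * expansion_rate(z, 0.3) for z in redshifts]
    sigma = [3.0] * len(redshifts)  # assumed errors in km/s/Mpc
    emulator = GridEmulator(redshifts)
    chain = metropolis(lambda t: log_posterior(t, emulator, h_obs, sigma),
                       start=[68.0, 0.32], step_sizes=[0.5, 0.005],
                       n_steps=n_steps, rng=random.Random(seed))
    kept = chain[burn_in:]
    h0_mean = sum(s[0] for s in kept) / len(kept)
    om_mean = sum(s[1] for s in kept) / len(kept)
    return h0_mean, om_mean


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the solver is called only while the table is built, and every likelihood evaluation inside the chain uses the surrogate. The speedup this buys in practice, and whether the resulting confidence regions match those from the exact solver, depend on the emulator's accuracy over the prior range, which is the trade-off described in finding (1).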
