A precise symbolic emulator of the linear matter power spectrum

Deaglan J. Bartlett, Lukas Kammerer, Gabriel Kronberger, Harry Desmond, Pedro G. Ferreira, Benjamin D. Wandelt, Bogdan Burlacu, David Alonso, Matteo Zennaro

TL;DR

This paper tackles the slow evaluation of the linear matter power spectrum across cosmologies by building analytic, interpretable emulators via symbolic regression. By modeling the residual between the physically motivated Eisenstein & Hu approximation and Boltzmann-solver results, the authors derive compact expressions for both $P(k)$ and $\sigma_8$ that reach sub-percent accuracy over wide parameter ranges. The $\sigma_8$ emulator achieves a root mean squared fractional error of $\sim 0.1\%$ and is readily invertible to obtain $A_{\rm s}$. The linear $P(k)$ emulator attains a root mean squared fractional error of $\sim 0.2\%$ with a transparent, BAO-aware functional structure, and evaluates about $950\times$ faster than CAMB and $36\times$ faster than BACCO. The work emphasizes interpretability and longevity, showing that analytic, physics-informed expressions can rival neural networks for current and future cosmological analyses, with clear paths for extending to non-linear regimes and additional physics.
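
Why the $\sigma_8$ expression inverts so easily can be seen from the form of the fit described in the Figure 1 caption, where the fitted quantity is $\sigma_8 / \sqrt{10^9 A_{\rm s}}$. As a sketch, writing $g$ for that fitted function (the symbol $g$ is a label introduced here for illustration, not the paper's notation):

$$
\sigma_8 \approx \sqrt{10^9 A_{\rm s}}\; g(\Omega_{\rm b}, \Omega_{\rm m}, h, n_{\rm s})
\quad\Longrightarrow\quad
A_{\rm s} \approx 10^{-9} \left[ \frac{\sigma_8}{g(\Omega_{\rm b}, \Omega_{\rm m}, h, n_{\rm s})} \right]^2 .
$$

Since $g$ does not depend on $A_{\rm s}$, the inversion is a single algebraic step and needs no root finding.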

Abstract

Computing the matter power spectrum, $P(k)$, as a function of cosmological parameters can be prohibitively slow in cosmological analyses, hence emulating this calculation is desirable. Previous analytic approximations are insufficiently accurate for modern applications, so black-box, uninterpretable emulators are often used. We utilise an efficient genetic-programming-based symbolic regression framework to explore the space of potential mathematical expressions which can approximate the power spectrum and $\sigma_8$. We learn the ratio between an existing low-accuracy fitting function for $P(k)$ and that obtained by solving the Boltzmann equations and thus still incorporate the physics which motivated this earlier approximation. We obtain an analytic approximation to the linear power spectrum with a root mean squared fractional error of 0.2% between $k = 9\times10^{-3}$ and $9 \, h \, {\rm Mpc^{-1}}$ and across a wide range of cosmological parameters, and we provide physical interpretations for various terms in the expression. Our analytic approximation is 950 times faster to evaluate than camb and 36 times faster than the neural-network-based matter power spectrum emulator BACCO. We also provide a simple analytic approximation for $\sigma_8$ with a similar accuracy, with a root mean squared fractional error of just 0.1% when evaluated across the same range of cosmologies. This function is easily invertible to obtain $A_{\rm s}$ as a function of $\sigma_8$ and the other cosmological parameters, if preferred. It is possible to obtain symbolic approximations to a seemingly complex function at a precision required for current and future cosmological analyses without resorting to deep-learning techniques, thus avoiding their black-box nature and large number of parameters. Our emulator will be usable long after the codes on which numerical approximations are built become outdated.
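
The ratio-learning idea in the abstract can be illustrated with a minimal Python sketch. It assumes the correction enters multiplicatively as a factor $F$ that is fitted in $\log F$ (as the Figure 3 caption suggests), and that the quoted accuracy is the root mean square of $P_{\rm pred}/P_{\rm true} - 1$. The functions `baseline_pk` and `log_correction` are toy placeholders invented for this example: they are neither the Eisenstein & Hu formula nor the paper's fitted expression.

```python
import numpy as np


def baseline_pk(k, amplitude=1.0, k0=0.02, ns=0.96):
    """Toy smooth baseline spectrum (a placeholder, NOT Eisenstein & Hu)."""
    k = np.asarray(k, dtype=float)
    return amplitude * k**ns / (1.0 + (k / k0) ** 2) ** 2


def log_correction(k):
    """Toy stand-in for a learned log F(k): a damped oscillation."""
    k = np.asarray(k, dtype=float)
    return 0.05 * np.sin(k / 0.01) * np.exp(-k / 0.2)


def emulate_pk(k):
    """Emulated spectrum: baseline multiplied by the correction F = exp(log F)."""
    return baseline_pk(k) * np.exp(log_correction(k))


def rms_fractional_error(p_pred, p_true):
    """Root mean squared fractional error between two spectra."""
    p_pred = np.asarray(p_pred, dtype=float)
    p_true = np.asarray(p_true, dtype=float)
    return float(np.sqrt(np.mean((p_pred / p_true - 1.0) ** 2)))


# Wavenumber range quoted in the abstract, in h / Mpc.
k = np.logspace(np.log10(9e-3), np.log10(9.0), 200)

# Made-up "truth": the emulated spectrum scaled up uniformly by 0.2%.
p_true = 1.002 * emulate_pk(k)
err = rms_fractional_error(emulate_pk(k), p_true)
print(f"RMS fractional error: {err:.4%}")
```

Fitting $\log F$ rather than $P(k)$ itself means the regression only has to capture a small, order-unity correction, while the baseline carries the large dynamic range of the spectrum.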

Paper Structure

This paper contains 9 sections, 7 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: Pareto front of solutions obtained using operon when fitting $\sigma_8 / \sqrt{10^9 A_{\rm s}}$ as a function of $\Omega_{\rm b}$, $\Omega_{\rm m}$, $h$ and $n_{\rm s}$. We plot the root mean squared error as a function of model length for the training and validation sets separately. The model in \ref{eq:sigma8_fit} has a model length of 27.
  • Figure 2: Linear matter power spectrum (upper), the residuals (\ref{eq:Pk_residual_definition}) from the Eisenstein & Hu (1998) fit without baryons (middle), and the fractional residuals on $P(k)$ compared to the truth (lower), for the Planck 2018 cosmology \citep{Planck_VI_2018}. In all panels, the truth computed with camb is plotted as solid red lines and the analytic fit (\ref{eq:pk_lin_fit}) obtained in this paper as dashed blue lines. The fit is accurate to within 0.3% across all $k$ considered.
  • Figure 3: Pareto front of solutions obtained using operon when fitting the linear matter power spectrum as a function of $\sigma_8$, $\Omega_{\rm b}$, $\Omega_{\rm m}$, $h$ and $n_{\rm s}$. We plot the root mean squared error on $\log F$ as a function of model length for the training and validation sets separately. The model given in \ref{eq:pk_lin_fit} has a model length of 77, as indicated by the dotted line.
  • Figure 4: Distribution of fractional errors on the linear matter power spectrum as a function of $k$, across all cosmologies in the training and validation sets, relative to the predictions of camb. The bands give the $1\sigma$ and $2\sigma$ ranges. The dotted line corresponds to a 1% error; our expression stays within this for all cosmologies and values of $k$ considered, with a root mean squared fractional error of 0.2%.
  • Figure 5: Contributions to $\log F$ from our emulator as a function of $k$ for the Planck 2018 cosmology. The line numbers in the legend correspond to the lines of \ref{eq:pk_lin_fit}. The first term provides an overall offset, the second and fourth capture the BAO signal, and the third contains a broad oscillation and then matches on to the decaying residual at high $k$.
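
The Pareto fronts in Figures 1 and 3 trade model length against root mean squared error. A minimal sketch of how such a front can be extracted from a set of candidate models is given below. The helper `pareto_front` and the `(length, rmse)` pairs are hypothetical and made up for illustration; this is not operon's API, nor results from the paper.

```python
def pareto_front(models):
    """Return the non-dominated (length, rmse) pairs, sorted by length.

    A model is kept only if no other model is at least as short
    and at least as accurate.
    """
    front = []
    best_rmse = float("inf")
    # Sort by length, then by error, so the best model of each length comes first.
    for length, rmse in sorted(models):
        if rmse < best_rmse:  # strictly more accurate than every shorter model
            front.append((length, rmse))
            best_rmse = rmse
    return front


# Made-up candidates: (model length, validation RMSE).
candidates = [(3, 0.10), (7, 0.04), (7, 0.06), (12, 0.05), (20, 0.01), (31, 0.01)]
front = pareto_front(candidates)
print(front)
```

In this toy set, the length-12 model is dropped because a shorter model is already more accurate, and the length-31 model is dropped because it only matches the error of the length-20 one.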