Interpretable and physics-informed emulator for the linear matter power spectrum from machine learning

J. Bayron Orjuela-Quintana, Domenico Sapone, Savvas Nesseris

TL;DR

The results provide compact, accurate, and physically motivated fitting functions for the linear MPS in both standard and MG cosmologies, offering a fast and transparent alternative to existing emulators for parameter inference and theoretical modeling in large-scale structure analyses.

Abstract

We present an interpretable emulator for the linear matter power spectrum (MPS) in the standard cosmological model $\Lambda$CDM, constructed via a physics-informed symbolic regression framework. By combining domain knowledge with a machine learning technique known as genetic algorithms (GA), we explore the space of analytic expressions to derive closed-form, smooth, physically motivated approximations of the MPS that match the accuracy of standard broadband reconstruction methodologies such as the Savitzky-Golay filter. Building upon this baseline, we incorporate transparent oscillatory corrections informed by the physics of baryon acoustic oscillations (BAO). The resulting expression delivers mean sub-percent fractional errors across a broad range of scales ($k \in [10^{-5}, 1.5]~h\,\mathrm{Mpc}^{-1}$) with an average deviation of $\sim 0.4\%$ when tested against spectra computed with a Boltzmann solver. Moreover, a comparable level of fractional deviation is maintained on smaller scales when the GA-derived formulation is used as input to the nonlinear emulator halofit. To illustrate the versatility of the framework beyond $\Lambda$CDM, we apply it to a representative $f(R)$ gravity model. Rather than training a general modified-gravity (MG) emulator, we compute the corresponding linear spectra with a Boltzmann solver and fit a parametric deformation of the $\Lambda$CDM smoothed component. This procedure achieves average errors at the 1.5-1.8\% level and captures the leading modulation of the MPS induced by modified gravity, enabling a controlled study of its impact on the BAO scale. Our results provide compact, accurate, and physically motivated fitting functions for the linear MPS in both standard and MG cosmologies, offering a fast and transparent alternative to existing emulators for parameter inference and theoretical modeling in large-scale structure analyses.
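
To make the genetic-algorithm step concrete, the following is a minimal sketch and not the authors' implementation. Full symbolic regression evolves the analytic expressions themselves; as a simplifying assumption, this toy evolves only the three coefficients of a fixed, hypothetical template against synthetic data, using the mean absolute percentage error (MAPE) as the fitness. The names `template`, `mape`, and `evolve`, and all numerical settings, are illustrative choices rather than anything taken from the paper.

```python
import numpy as np

def template(k, params):
    """Hypothetical smooth template: power law with a high-k suppression."""
    a, n, k0 = params
    return a * k**n / (1.0 + (k / k0) ** 2) ** 2

def mape(model, target):
    """Mean absolute percentage error, in percent."""
    return 100.0 * np.mean(np.abs(model / target - 1.0))

def evolve(k, target, n_pop=60, n_gen=200, sigma=0.1, seed=0):
    """Toy GA: truncation selection, uniform crossover, mutation, elitism."""
    rng = np.random.default_rng(seed)
    # Initial population: log-uniform coefficients around order unity (assumed).
    pop = np.exp(rng.uniform(-1.0, 1.0, size=(n_pop, 3)))
    history = []
    for _ in range(n_gen):
        fitness = np.array([mape(template(k, p), target) for p in pop])
        order = np.argsort(fitness)
        pop = pop[order]
        history.append(fitness[order[0]])
        parents = pop[: n_pop // 2]  # keep the fitter half
        # Uniform crossover between randomly paired parents.
        idx_a = rng.integers(0, len(parents), size=n_pop - 1)
        idx_b = rng.integers(0, len(parents), size=n_pop - 1)
        mask = rng.random((n_pop - 1, 3)) < 0.5
        children = np.where(mask, parents[idx_a], parents[idx_b])
        # Multiplicative log-normal mutation keeps coefficients positive.
        children = children * np.exp(sigma * rng.normal(size=children.shape))
        # Elitism: the best individual survives unchanged.
        pop = np.vstack([pop[:1], children])
    fitness = np.array([mape(template(k, p), target) for p in pop])
    history.append(fitness.min())
    return pop[np.argmin(fitness)], np.array(history)

k = np.logspace(-3, 0, 200)
true_params = (2.0, 1.0, 0.5)  # synthetic "truth", not a cosmology
target = template(k, true_params)
best, history = evolve(k, target)
print("best parameters:", best)
print("final MAPE [%]:", history[-1])
```

Because the best individual is carried over unchanged, the recorded best fitness can never increase from one generation to the next. Running `evolve` with several values of `seed` and plotting each `history` gives a toy analogue of a fitness-versus-generation curve, including the plateau that signals stagnation.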

Paper Structure

This paper contains 37 sections, 122 equations, 15 figures, and 13 tables.

Figures (15)

  • Figure 1: Left: Evolution of the fitness function across $10^4$ generations for 100 different GA runs initialized with different random seeds. Right: Fitness evolution for the best-performing run extended to $10^5$ generations. The plateau indicates stagnation in the optimization process.
  • Figure 2: Top: $\text{MAPE}$ as a function of $k$ for the reconstructed $P_{\text{GA}, \text{nw}}$ across 200 cosmologies sampled in a Latin hypercube (LH) from Table \ref{Tab: Params}. Thin gray lines represent individual models; the best and worst cases are highlighted in color. Our formula maintains better than $1\%$ accuracy across the full range, except for $k \sim 0.01\!-\!0.3~h\,\text{Mpc}^{-1}$, where BAO features dominate. Bottom: Distribution of the fractional errors corresponding to the $1\sigma$ and $2\sigma$ regions. Dashed black lines represent the $1\%$ deviation region.
  • Figure 3: Fitness as a function of wavenumber $k$ over the training set. Left: Fitness without correction. Errors peak near $k \sim 0.02~h\,\text{Mpc}^{-1}$ (equality scale) and at $k \sim 0.2~h\,\text{Mpc}^{-1}$ (diffusion scale). The latter feature appears largely independent of the cosmological parameters. Right: Fitness after applying a Gaussian correction around $k \sim 0.2~h\,\text{Mpc}^{-1}$, improving the overall fit to $\text{MAPE}(\text{GA}) = 0.25\%$.
  • Figure 4: Point-wise fractional error $\text{MAPE}(k)$ of the emulated linear MPS. Top left: The original model $P_\text{GA}$ achieves $\text{MAPE}(\text{GA}) = 0.42\%$. Larger errors are observed around $k \sim 0.02~h\,\text{Mpc}^{-1}$ and $k \sim 0.1~h\,\text{Mpc}^{-1}$. Top right: The improved version, which includes three localized Gaussian corrections, reduces the mean fractional error to $\text{MAPE}(\text{GA}) = 0.39\%$, significantly improving the match around the Silk damping scale. Bottom panels: Distribution of the fractional errors corresponding to the $1\sigma$ and $2\sigma$ regions without corrections around the Silk scale (bottom left) and with Gaussian corrections around this scale (bottom right).
  • Figure 5: Best-fit (green) and worst-fit (red) examples from the 200 test cosmologies. The worst case fails to match the location and amplitude of the peak at $k_\text{max}$, as indicated by the CLASS prediction (black dashed line).
  • ...and 10 more figures
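
The evaluation described in the captions above (Latin-hypercube sampling of cosmologies, point-wise error as a function of $k$, best and worst cases) can be sketched as follows. This is a minimal illustration under stated assumptions: the parameter names and ranges are placeholders rather than the values in the paper's parameter table, and `reference_spectrum` and `emulated_spectrum` are hypothetical toy stand-ins for a Boltzmann-solver output and a smooth fitting formula.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points with exactly one point per stratum in each dimension."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    d = len(lo)
    # One jittered point per stratum, then an independent shuffle per dimension.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, d))) / n_samples
    for j in range(d):
        u[:, j] = rng.permutation(u[:, j])
    return lo + u * (hi - lo)

def emulated_spectrum(k, omega_m, omega_b):
    """Hypothetical smooth fitting formula: broadband only, no oscillations."""
    k_eq = 0.015 * omega_m / 0.3  # toy turnover scale (assumed)
    return k / (1.0 + (k / k_eq) ** 2) ** 1.5

def reference_spectrum(k, omega_m, omega_b):
    """Hypothetical stand-in for a Boltzmann-solver spectrum (toy shape only)."""
    amp = 0.2 * omega_b / omega_m  # toy oscillation amplitude (assumed)
    wiggle = amp * np.sin(100.0 * k) * np.exp(-(k / 0.2) ** 2)
    return emulated_spectrum(k, omega_m, omega_b) * (1.0 + wiggle)

bounds = [(0.25, 0.40), (0.04, 0.06)]  # placeholder ranges for (omega_m, omega_b)
samples = latin_hypercube(200, bounds)
k = np.logspace(-5, np.log10(1.5), 300)

# Absolute percentage error per cosmology (rows) and wavenumber (columns).
err = np.array([
    100.0 * np.abs(emulated_spectrum(k, om, ob) / reference_spectrum(k, om, ob) - 1.0)
    for om, ob in samples
])
mape_per_k = err.mean(axis=0)      # error profile as a function of k
mape_per_model = err.mean(axis=1)  # one MAPE value per cosmology
best, worst = np.argmin(mape_per_model), np.argmax(mape_per_model)
print("overall MAPE [%]:", err.mean())
print("best / worst cosmology:", samples[best], samples[worst])
print("k of largest mean error:", k[np.argmax(mape_per_k)])
```

In this toy setup the smooth formula misses only the oscillatory term, so the error profile is concentrated where the oscillations are largest and the worst case is the sample with the largest assumed oscillation amplitude. Replacing the two stand-in functions with real solver output and a real fitting formula leaves the bookkeeping unchanged.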