Sym-EFT: Accelerating Effective Field Theory of Large Scale Structure with Symbolic Regression

Despoina Farakou, Constantinos Skordis

TL;DR

This work develops Sym-EFT, an emulator suite that accelerates the Effective Field Theory of Large Scale Structure by emulating the one- and two-loop contributions with explicit, differentiable functional forms obtained via symbolic regression. By separating the time-dependent counterterms from the emulation, the approach remains flexible to various counterterm parametrizations while achieving better than 0.5% accuracy within the EFT validity range and evaluating in sub-millisecond times per model. The emulators are trained on CosmoEFT-Class/ResumEFT data across hundreds of cosmologies, and they demonstrate substantial speedups (up to ~10^6–10^7) over traditional EFT codes, with robust cross-code validation against PyBird, CLASS-PT, and CosmoEFT. The framework supports integration with Boltzmann solvers like CLASS and has strong potential for expanding to CMB lensing and high-precision cosmology analyses, including future sub-0.1% accuracy goals with continued refinement.

Abstract

We present an emulator suite for the one- and two-loop cold dark matter power spectrum from the Effective Field Theory of Large Scale Structures (EFTofLSS). Specifically, we emulate separately the various contributions to the one- and two-loop parts of the power spectrum, leaving out the possible counterterms which can be added as multiplicative prefactors. By leaving the time-dependence of the counterterms unspecified at the emulation stage, our technique has the advantage of being extremely versatile in fitting any type of counterterm parametrisation to data, or to simulations, without having to change the emulator. We construct our emulators using the method of symbolic regression, which results in functions that can be used directly in computer code, while achieving errors of better than $0.5\%$ within the $k$-range of validity of EFT and maintaining ultra-fast computational evaluation of less than $\sim 5\times10^{-4}\,\mathrm{s}$ on a single core.
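
The separation between emulated contributions and counterterm prefactors described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function `combine`, the component names `P11` and `k2P11`, the toy closed-form expressions, and the two counterterm parametrisations are hypothetical stand-ins, not the fitted expressions or the interface of Sym-EFT. The sketch only shows that the same emulated pieces can be reused while the time dependence of the prefactors is swapped.

```python
from typing import Callable, Dict

Component = Callable[[float], float]  # k -> emulated contribution (time-independent piece)
Prefactor = Callable[[float], float]  # z -> user-chosen, possibly time-dependent coefficient


def combine(components: Dict[str, Component],
            prefactors: Dict[str, Prefactor],
            k: float, z: float) -> float:
    """Sum every emulated contribution weighted by its multiplicative prefactor."""
    return sum(prefactors[name](z) * f(k) for name, f in components.items())


# Toy closed-form stand-ins (hypothetical; NOT the expressions found in the paper).
components = {
    "P11": lambda k: k / (1.0 + k**2) ** 2,
    "k2P11": lambda k: k**3 / (1.0 + k**2) ** 2,
}

# Two hypothetical counterterm parametrisations sharing the same emulated pieces.
constant_ct = {"P11": lambda z: 1.0, "k2P11": lambda z: -1.0}
evolving_ct = {"P11": lambda z: 1.0, "k2P11": lambda z: -1.0 / (1.0 + z)}

p_const = combine(components, constant_ct, k=0.1, z=1.0)
p_evolv = combine(components, evolving_ct, k=0.1, z=1.0)
```

Switching from `constant_ct` to `evolving_ct` changes only the dictionary of prefactors; the emulated components are evaluated unchanged, which is the flexibility the abstract refers to.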

Paper Structure

This paper contains 33 sections, 38 equations, 17 figures, 4 tables.

Figures (17)

  • Figure 1: The different contributions to the matter power spectrum. Shown is the linear power spectrum (dotted, blue), 1-loop SPT (dashed, green), 2-loop SPT (dash-dot, red) and the full nonlinear power spectrum from the Syren-New emulator (solid, orange).
  • Figure 2: Top: $P^{\rm SPT}_{\rm 1-loop}(k)$ $\%$-relative difference of CosmoEFT (dash-dot, blue), CosmoEFT-Class (solid, orange) and CLASS-PT (dotted, purple), each compared with PyBird (baseline, dashed black). Middle: IR-resummed $P^{\rm SPT}_{\rm 1-loop}(k)$ $\%$-relative difference of CosmoEFT (dash-dot, blue), CosmoEFT-Class (solid, orange) and CLASS-PT (dotted, purple) compared with PyBird (baseline). Bottom: Comparison between the $2$-loop SPT power spectrum $P^{\rm SPT}_{\rm 2-loop}(k)$ with and without IR resummation. Shown is the $\%$-relative difference between the baseline model (IR-resummed CosmoEFT) and CosmoEFT prior to resummation (dash-dot, blue), and with CosmoEFT-Class (solid, orange) with resummation.
  • Figure 3: Training data for 200 cosmologies. Each line corresponds to a specific set of cosmological parameters, sampled with a Latin hypercube within the bounds in Table \ref{tab:cosmological_parameters}.
  • Figure 4: Left: The Pareto front of RMSE vs model length for the $\frac{ {[k^2 P_{11}]_{\|0}}}{k^2 P_{11} }$ emulator runs as generated by Operon, with blue marking the training and red the validation error, and with the chosen model of length 50 indicated by the vertical dashed line. Right: The top plot shows the ${[k^2 P_{11}]_{\|0}}$ function for two extreme cases of cosmological parameters while the bottom plot displays the resulting $1\sigma$ and $2\sigma$ emulator $\%$ error for all 300 cosmologies. The horizontal dashed lines mark the $0.5\%$ threshold.
  • Figure 5: Left: The ${[P_{\rm 1-loop}]_{\|0}}$ power spectrum for one set of cosmological parameters, split into three overlapping regions whose boundaries are marked with vertical lines. We plot each emulated function within its region with a solid line, and outside its region of validity with a dashed line. Right: The ${[P_{\rm 2-loop}]_{\|0}}$ power spectrum, split into two overlapping regions whose boundaries are marked with vertical lines. We use the same plotting conventions as in the left panel.
  • ...and 12 more figures