OLÉ -- Online Learning Emulation in Cosmology
Sven Günther, Lennart Balkenhol, Christian Fidler, Ali Rida Khalife, Julien Lesgourgues, Markus R. Mosbech, Ravi Kumar Sharma
TL;DR
OLÉ tackles the cost of cosmological inference with an online learning emulator that trains during inference, using PCA for data compression and Gaussian Processes for fast predictions, with automatic accuracy checks that trigger retraining as needed. The approach yields speed-ups of $30$ to $350\times$ while preserving accuracy relative to full Boltzmann codes, and its emulators are differentiable, which combined with differentiable likelihoods gives an additional $\sim 4\times$ improvement via gradient-based sampling. OLÉ demonstrates its effectiveness across ΛCDM and extended cosmologies, including Stage-IV LSS forecasts and NEDE scenarios, and interfaces with CLASS, CAMB, Cobaya, and MontePython. The combination of no pre-training, on-the-fly data acquisition, and robust uncertainty quantification enables scalable, energy-efficient cosmological analyses on complex data sets, with the code openly available in the OLÉ GitHub repository. The work further shows that differentiable pipelines combining OLÉ with candl can yield substantial gains in sampling efficiency, making high-precision cosmology more practical in large parameter spaces.
Abstract
In this work, we present OLÉ, a new online learning emulator for use in cosmological inference. The emulator relies on Gaussian Processes and Principal Component Analysis for efficient data compression and fast evaluation. Moreover, OLÉ features automatic error estimation for optimal active sampling and online learning. All training data is computed on-the-fly, making the emulator applicable to any cosmological model or data set. We illustrate the emulator's performance on an array of cosmological models and data sets, showing significant improvements in efficiency over similar emulators without degrading accuracy compared to standard theory codes. We find that OLÉ is able to considerably speed up the inference process, increasing the efficiency by a factor of $30$ to $350$, including data acquisition and training. With the emulator in place, the runtime of the likelihood code typically becomes the computational bottleneck. Furthermore, OLÉ emulators are differentiable; we demonstrate that, together with the differentiable likelihoods available in the $\texttt{candl}$ library, we can construct a gradient-based sampling method which yields an additional improvement factor of 4. OLÉ can be easily interfaced with the popular samplers $\texttt{MontePython}$ and $\texttt{Cobaya}$, and the Einstein-Boltzmann solvers $\texttt{CLASS}$ and $\texttt{CAMB}$. OLÉ is publicly available at https://github.com/svenguenther/OLE .
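To make the idea of online learning emulation concrete, the following is a minimal sketch of the general pattern described above: compress the outputs of a theory code with PCA, fit a Gaussian Process to the compressed coefficients, and fall back to the full code (adding the result to the training set) whenever the emulator's own uncertainty estimate is too large. It is not OLÉ's API or implementation. All names (`ToyOnlineEmulator`, `expensive_theory_code`, `run_demo`), the toy two-parameter model standing in for a Boltzmann solver, the fixed kernel length scale, and the use of the GP predictive standard deviation of normalized PCA coefficients as the accuracy criterion are assumptions made for illustration, and parameters are drawn uniformly rather than from a sampler.

```python
import numpy as np

X_GRID = np.linspace(0.0, 1.0, 50)


def expensive_theory_code(theta):
    """Hypothetical stand-in for a slow solver: 2 parameters -> 50-point vector."""
    a, b = theta
    return a * np.sin(2.0 * np.pi * X_GRID) + b * X_GRID**2


def rbf_kernel(A, B, length=0.5):
    """Unit-amplitude squared-exponential kernel between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length**2)


class ToyOnlineEmulator:
    """Illustrative PCA + GP emulator that acquires training data on the fly."""

    def __init__(self, n_components=2, tol=0.05, n_min=3, jitter=1e-6):
        self.n_components = n_components
        self.tol = tol          # max GP std (normalized units) to trust a prediction
        self.n_min = n_min      # points needed before the first training
        self.jitter = jitter    # diagonal term for numerical stability
        self.thetas, self.outputs = [], []
        self.n_calls = 0        # number of calls to the expensive code
        self.trained = False

    def _train(self):
        X = np.array(self.thetas)
        Y = np.array(self.outputs)
        # PCA via SVD of the centered outputs.
        self.mean = Y.mean(axis=0)
        _, _, Vt = np.linalg.svd(Y - self.mean, full_matrices=False)
        self.basis = Vt[: self.n_components]
        coeffs = (Y - self.mean) @ self.basis.T
        self.scale = coeffs.std(axis=0) + 1e-12
        # GP regression on the normalized PCA coefficients (shared kernel).
        K = rbf_kernel(X, X) + self.jitter * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(K, coeffs / self.scale)
        self.X = X
        self.trained = True

    def _predict(self, theta):
        k = rbf_kernel(theta[None, :], self.X)          # shape (1, n_train)
        coeffs = (k @ self.alpha)[0] * self.scale       # de-normalized coefficients
        v = np.linalg.solve(self.L, k.T)
        var = 1.0 - float((v**2).sum())                 # prior variance is 1
        return self.mean + coeffs @ self.basis, np.sqrt(max(var, 0.0))

    def evaluate(self, theta):
        theta = np.asarray(theta, dtype=float)
        if self.trained:
            y, std = self._predict(theta)
            if std < self.tol:
                return y                                # emulator is trusted here
        # Otherwise run the full code, store the result, and retrain.
        y = expensive_theory_code(theta)
        self.n_calls += 1
        self.thetas.append(theta)
        self.outputs.append(y)
        if len(self.thetas) >= self.n_min:
            self._train()
        return y


def run_demo(n_steps=200, seed=0):
    """Evaluate random parameter points; return (expensive calls, max abs error)."""
    rng = np.random.default_rng(seed)
    emu = ToyOnlineEmulator()
    max_err = 0.0
    for _ in range(n_steps):
        theta = rng.uniform(0.0, 1.0, size=2)
        y = emu.evaluate(theta)
        # Reference evaluation for checking only; not counted in n_calls.
        err = np.abs(y - expensive_theory_code(theta)).max()
        max_err = max(max_err, float(err))
    return emu.n_calls, max_err


if __name__ == "__main__":
    n_calls, max_err = run_demo()
    print(f"expensive calls: {n_calls} of 200, max abs error: {max_err:.2e}")
```

In this toy setting the number of expensive calls saturates once the parameter region is covered, which is the mechanism behind the efficiency gain. A production emulator would in addition need to tune the kernel hyperparameters and judge accuracy by the impact on the likelihood rather than by a raw GP standard deviation.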
