How to embed any likelihood into SBI: Application to Planck + Stage IV galaxy surveys and Dynamical Dark Energy

Guillermo Franco Abellán, Noemi Anau Montel, Oleg Savchenko, Christoph Weniger

Abstract

Simulation-based inference (SBI) allows fast Bayesian inference for simulators encoding implicit likelihoods. However, some explicit likelihoods cannot be easily reformulated as simulators, hindering their integration into combined analyses within SBI frameworks. One key example in cosmology is given by the Planck CMB likelihoods. We present a simple method to construct an effective simulator for any explicit likelihood using samples from a previously converged Markov Chain Monte Carlo (MCMC) run. This effective simulator can subsequently be combined with any forward simulator. To illustrate this method, we combine the full Planck CMB likelihoods with a 3$\times$2pt simulator (cosmic shear, galaxy clustering and their cross-correlation) for a Stage IV survey like Euclid, and test evolving dark energy parameterized by the $w_0w_a$ equation of state. Assuming the $w_0w_a$CDM cosmology hinted by DESI BAO DR2 + Planck 2018 + PantheonPlus SNIa datasets, we find that future 3$\times$2pt data alone could detect evolving dark energy at $5\sigma$, while its combination with current CMB, BAO and SNIa datasets could raise the detection to almost $7\sigma$. Moreover, thanks to simulation reuse enabled by SBI, we show that our joint analysis is in excellent agreement with MCMC while requiring zero Boltzmann solver calls. This result opens up the possibility of performing massive global scans combining explicit and implicit likelihoods in a highly efficient way.
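
As a rough illustration of what an "effective simulator" built from MCMC samples could look like, the toy sketch below resamples a 1-dimensional chain and shifts each draw by a random offset $a$, so that the pairs $(\theta, a)$ conditioned on $a = 0$ reproduce the original posterior. This shift construction is an assumption made for illustration; the paper's own definition of the auxiliary observable is not reproduced in this excerpt and may differ. The names `chain`, `effective_simulator` and `shift_scale`, and the Gaussian stand-in for the chain, are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a converged MCMC chain, i.e. samples of theta ~ p(theta | x_0).
# Hypothetical toy posterior: Gaussian with mean 0.3 and standard deviation 0.1.
chain = rng.normal(loc=0.3, scale=0.1, size=100_000)


def effective_simulator(n, shift_scale=0.5):
    """Draw n (theta, a) pairs from the chain.

    ASSUMED construction (not taken from the paper): resample a chain point,
    draw an offset a ~ N(0, shift_scale^2), and set theta = chain point + a.
    By construction p(theta | a) is the posterior shifted by a, so a = 0
    gives back p(theta | x_0).
    """
    theta_chain = rng.choice(chain, size=n, replace=True)
    a = rng.normal(scale=shift_scale, size=n)
    theta = theta_chain + a
    return theta, a


theta, a = effective_simulator(500_000)

# Crude stand-in for evaluating a trained inference network at a = 0:
# keep only the pairs whose offset falls in a narrow window around zero.
near_zero = np.abs(a) < 0.01

print("p(theta | a ~ 0): mean, std =", theta[near_zero].mean(), theta[near_zero].std())
print("original chain:   mean, std =", chain.mean(), chain.std())
```

In an actual SBI analysis the pairs $(\theta, a)$ would be used to train an inference network, which is then evaluated at $a = 0$; the window cut above only serves to check the toy construction without any training.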

Paper Structure

This paper contains 20 sections, 16 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Summary illustration of the auxiliary-observable construction for a toy 1-dimensional problem. Left panel: Definition of the auxiliary observable $a$ according to Eq. \ref{eq:a_definition} and its relationship with $\theta$. Right panel: Multiple conditional distributions $p(\theta\mid a)$ for different values of the auxiliary observable $a$, with the true posterior $p(\theta\mid x_0)$ (black solid line) recovered for $a = 0$ (red dashed line). This shows how the framework preserves the original likelihood information at $a = 0$.
  • Figure 2: 1- and 2-dimensional marginalized posterior distributions (68% and 95% C.L.) of the $\Lambda$CDM cosmological parameters for different data combinations, using both MCMC (dashed lines) and SBI (solid lines). The black dotted lines indicate the $\Lambda$CDM best-fit to Planck 2018, which is our 'Fiducial I' model used to generate the mock 3$\times$2pt observation for a Stage IV photometric survey. SBI posteriors are in excellent agreement with MCMC.
  • Figure 3: Number of required model evaluations across all cosmological scenarios, datasets and inference methods considered in this work. For MCMC, this is the number of likelihood evaluations needed for convergence, whereas for SBI the meaning differs for each simulator. On the one hand, our effective simulators of the Planck 2018 and Baseline (= Planck 2018 + DESI BAO DR2 + PantheonPlus) datasets use $10^5$ posterior samples that can be promptly generated from the corresponding pre-converged MCMC runs, and hence require the same number of evaluations as MCMC. On the other hand, our forward 3$\times$2pt simulators use only $5\times 10^4$ data realizations, a factor of $\sim 6$ fewer than the evaluations needed for MCMC. Finally, for the SBI combined analyses, we reused the samples from the Planck/Baseline and 3$\times$2pt runs, hence requiring zero new model evaluations. All SBI runs additionally involve training the inference networks, which takes just 5 to 20 min on a single GPU.
  • Figure 4: 2-dimensional marginalized posterior (68% and 95% C.L.) of $w_a$ and $w_0$ for different data combinations, using both MCMC (dashed lines) and SBI (solid lines). The star indicates the $w_0 w_a$CDM best-fit to our Baseline dataset (Planck 2018 + DESI BAO DR2 + PantheonPlus), which is our 'Fiducial II' model used to generate the mock Stage IV 3$\times$2pt observation. The black dotted lines indicate $w_0 = -1$ and $w_a = 0$; the $\Lambda$CDM limit lies at their intersection. The significance of rejection of $\Lambda$CDM is $3.3\sigma$, $5.0\sigma$ and $6.8\sigma$ for Baseline, 3$\times$2pt and their combination, respectively.
  • Figure 5: 1- and 2-dimensional marginalized posterior distributions (68% and 95% C.L.) of the $\Lambda$CDM cosmological parameters for the combination of Planck and mock 3$\times$2pt data for a Stage IV photometric survey. These were obtained using both the MCMC sampler MontePython-v3 with the full likelihoods (dashed blue lines) and the nested sampler Nautilus with the emulated likelihoods (solid purple lines). The posteriors from the emulated likelihoods are in excellent agreement with MCMC.
  • ...and 1 more figure
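
The Figure 4 caption quotes the rejection of $\Lambda$CDM in units of $\sigma$ for the two-parameter plane $(w_0, w_a)$. The excerpt does not state how these significances were computed, so the sketch below only shows one common convention, assumed here: a $\Delta\chi^2$ with 2 degrees of freedom is turned into a tail probability $p = e^{-\Delta\chi^2/2}$, which is then mapped to a two-sided Gaussian-equivalent number of $\sigma$. The function names are hypothetical, and the convention implicitly treats the posterior as Gaussian.

```python
from math import erfc, exp, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def pvalue_from_sigma(n_sigma):
    """Two-sided Gaussian tail probability outside +/- n_sigma."""
    return erfc(n_sigma / sqrt(2.0))


def sigma_from_pvalue(p):
    """Inverse of pvalue_from_sigma; numerically stable for very small p."""
    return -_STD_NORMAL.inv_cdf(p / 2.0)


def sigma_from_delta_chi2_2d(delta_chi2):
    """Gaussian-equivalent sigma for a Delta chi^2 with 2 degrees of freedom.

    ASSUMED convention: the chi-squared survival function for 2 dof
    is exp(-x / 2), and the result is quoted as a two-sided significance.
    """
    return sigma_from_pvalue(exp(-delta_chi2 / 2.0))


def delta_chi2_2d_from_sigma(n_sigma):
    """Delta chi^2 (2 dof) corresponding to a given two-sided significance."""
    return -2.0 * log(pvalue_from_sigma(n_sigma))


# Delta chi^2 thresholds matching the significances quoted in the caption.
for n_sigma in (3.3, 5.0, 6.8):
    print(n_sigma, "sigma ->", delta_chi2_2d_from_sigma(n_sigma))
```

The tail probability is passed to `inv_cdf` directly, rather than through `1 - p`, so that significances as high as those quoted in the caption do not lose precision to floating-point cancellation.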