
$\mathtt{emuflow}$: Normalising Flows for Joint Cosmological Analysis

Arrykrishna Mootoovaloo, Carlos García-García, David Alonso, Jaime Ruiz-Zapatero

TL;DR

This work constructs normalising flow models for a set of public cosmological datasets of general interest and makes them available, together with the software used to train them and to exploit them in cosmological parameter inference.

Abstract

Given the growth in the variety and precision of astronomical datasets of interest for cosmology, the best cosmological constraints are invariably obtained by combining data from different experiments. At the likelihood level, one complication in doing so is the need to marginalise over the high-dimensional parameter spaces of the models describing the data of each experiment. These include both the relatively small number of cosmological parameters of interest and a large number of "nuisance" parameters. Sampling over the joint parameter space for multiple experiments can thus become a very computationally expensive operation. This could be significantly simplified if one could sample directly from the marginal cosmological posterior distribution of preceding experiments, which depends only on the common set of cosmological parameters. In this paper, we show that this can be achieved by emulating marginal posterior distributions via normalising flows. The resulting trained normalising flow models can be used to efficiently combine cosmological constraints from independent datasets without increasing the dimensionality of the parameter space under study. We show that the method is able to accurately describe the posterior distribution of real cosmological datasets, as well as the joint distribution of different datasets, even when significant tension exists between experiments. The resulting joint constraints can be obtained in a fraction of the time it would take to combine the same datasets at the level of their likelihoods. We construct normalising flow models for a set of public cosmological datasets of general interest and make them available, together with the software used to train them and to exploit them in cosmological parameter inference.
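
The combination of independent datasets described above rests on a standard identity, sketched here under two assumptions that the abstract does not spell out: the datasets $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$ are conditionally independent given the cosmological parameters $\boldsymbol{\theta}$, and both marginal posteriors were derived with the same prior $p(\boldsymbol{\theta})$. Here $p(\boldsymbol{x}_{i}|\boldsymbol{\theta})$ denotes the likelihood of experiment $i$ after marginalising over its nuisance parameters.

$$
\begin{aligned}
p(\boldsymbol{\theta}|\boldsymbol{x}_{1},\boldsymbol{x}_{2})
&\propto p(\boldsymbol{x}_{1}|\boldsymbol{\theta})\,p(\boldsymbol{x}_{2}|\boldsymbol{\theta})\,p(\boldsymbol{\theta}) \\
&\propto \frac{p(\boldsymbol{\theta}|\boldsymbol{x}_{1})\,p(\boldsymbol{\theta}|\boldsymbol{x}_{2})}{p(\boldsymbol{\theta})}
\end{aligned}
$$

The second line follows from Bayes' theorem, $p(\boldsymbol{x}_{i}|\boldsymbol{\theta}) \propto p(\boldsymbol{\theta}|\boldsymbol{x}_{i})/p(\boldsymbol{\theta})$. If each marginal posterior on the right-hand side is replaced by a density estimate over $\boldsymbol{\theta}$ alone, the joint posterior can be sampled without reintroducing any nuisance parameters. For a flat prior, the division by $p(\boldsymbol{\theta})$ is a constant inside the prior support.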

Paper Structure (27 sections, 37 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: Directed Acyclic Graphs (DAGs). Panel (a) shows the typical inference problem in cosmology. Panel (b) shows the DAG for a joint analysis in the case where the forward model in experiment 1 also has nuisance parameters, $\boldsymbol{\beta}$, and for experiment 2 we have access to an approximate distribution, $p(\boldsymbol{\theta}|\boldsymbol{x}_{2})$. In Panel (c), we have marginalised over all the nuisance parameters and have approximate distributions $p(\boldsymbol{\theta}|\boldsymbol{x}_{1})$ and $p(\boldsymbol{\theta}|\boldsymbol{x}_{2})$. Note that we are working with independent datasets, hence there is no link between the two datasets, $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$.
  • Figure 2: The plot shows the training samples in blue. They are generated using a mixture of three normal distributions, with means $[-1.0, 0.5, 0.0]$ and standard deviations $[0.25, 0.50, 0.10]$. Therefore, $p(\theta)=\sum_{i=1}^{3}w_{i}\mathcal{N}(\mu_{i}, \sigma_{i})$, where $w_{i}=\frac{1}{3}$ is fixed. The probability distribution learned by the normalising flow model is shown in violet, while the dashed black curve shows the known distribution. The flow model accurately captures the distribution of the generated samples. See the explanation in §\ref{sec:1d-flow} for further details on the implementation.
  • Figure 3: The figure shows the joint distribution of a Gaussian posterior, obtained from a Gaussian Linear Model, and a banana posterior. See §\ref{sec:glm_banana} for implementation details. The blue and orange colors show the posteriors of the two parameters $\theta_{0}$ and $\theta_{1}$, sampled using MCMC, for the Gaussian Linear Model and the banana respectively. The normalising flows are built using these samples and are shown in green and red respectively. The purple shaded region shows the joint distribution using the individual likelihoods, while the brown contour shows the joint distribution using only the normalising flow models.
  • Figure 4: Joint posterior of the cosmological parameters only (marginalised over the nuisance parameters) from the P18 dataset. The green contours correspond to the samples obtained using the normalising flow model and the black contours are the original samples.
  • Figure 5: Panel (a) shows the joint posterior of the cosmological parameters where the P18 normalising flow is used as a prior in the analysis. Panel (b) shows the posterior in the case where the posterior is sampled using the local posterior due to the CGG21 and P18 datasets, where each density is learnt by the normalising flow model. In both plots, the green contours correspond to the posterior due to the normalising flow models, whereas the black contours are the known posterior as obtained by 2021JCAP...10..030G.
  • ...and 2 more figures
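
The one-dimensional target described in the Figure 2 caption can be reproduced with a few lines of NumPy. This is a minimal sketch of the known distribution and its training samples only, not of the flow model or of the authors' code; the function names, the sample size, the random seed, and the grid are illustrative assumptions.

```python
import numpy as np

# Mixture parameters quoted in the Figure 2 caption.
MEANS = np.array([-1.0, 0.5, 0.0])
SIGMAS = np.array([0.25, 0.50, 0.10])
WEIGHTS = np.full(3, 1.0 / 3.0)  # fixed, equal weights


def sample_mixture(n, rng):
    """Draw n samples: pick a component, then draw from its normal."""
    comp = rng.choice(3, size=n, p=WEIGHTS)
    return rng.normal(MEANS[comp], SIGMAS[comp])


def mixture_pdf(theta):
    """Evaluate p(theta) = sum_i w_i N(theta; mu_i, sigma_i)."""
    theta = np.atleast_1d(theta)[:, None]  # shape (n, 1) for broadcasting
    norm = SIGMAS * np.sqrt(2.0 * np.pi)
    comps = np.exp(-0.5 * ((theta - MEANS) / SIGMAS) ** 2) / norm
    return comps @ WEIGHTS


# Hypothetical settings: sample size, seed and grid are not from the paper.
rng = np.random.default_rng(0)
samples = sample_mixture(20000, rng)
grid = np.linspace(-3.0, 3.0, 6001)
density = mixture_pdf(grid)
```

Here `samples` plays the role of the training set that a density estimator would be fitted to, and `density` is the reference curve against which the learned distribution could be compared.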