Bayesian evidence estimation from posterior samples with normalizing flows

Rahul Srinivasan, Marco Crisostomi, Roberto Trotta, Enrico Barausse, Matteo Breschi

TL;DR

The paper tackles the challenge of estimating the Bayesian evidence $Z$ from samples drawn from the unnormalized posterior $\hat{p}(\mathbf{x})$. It introduces floZ, which trains a masked autoregressive flow to map the target distribution to a simple base density, allowing $Z$ to be recovered from density ratios and Jacobians via $\zeta(\mathbf{x},\phi)=\hat{p}(\mathbf{x})/q_{\phi}(\mathbf{x})$ and $Z=\frac{\hat{p}(f_{\phi}(\mathbf{y}))}{n(\mathbf{y})}\,|\det\frac{\partial f_{\phi}}{\partial \mathbf{y}}|$. The method includes novel loss terms $\mathcal{L}_2$ and $\mathcal{L}_3$ to stabilize $Z$ estimates, a boundary-reflection technique to handle sharp posterior edges, and a latent-space sampling strategy to improve robustness. Validation on tractable benchmarks up to $d=15$ and intractable cases (Rosenbrock) shows floZ performing comparably to nested sampling and generally outperforming $k$-NN as dimensionality grows; a high-dimensional Gaussian test demonstrates scalability up to $d=200$. A real-data application to GW150914 demonstrates consistent Bayes factors with nested sampling while delivering substantial runtime advantages, underscoring floZ’s practical utility for Bayesian model comparison in physics and astronomy.
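The step from the flow density to $Z$ can be spelled out as a short derivation. This is a sketch that assumes, as the formula above suggests, that $f_{\phi}$ maps a latent point $\mathbf{y}$ drawn from the base density $n(\mathbf{y})$ to $\mathbf{x}=f_{\phi}(\mathbf{y})$; the symbol $p(\mathbf{x})$ for the normalized posterior is introduced here for illustration only.

$$
\begin{aligned}
p(\mathbf{x}) &= \frac{\hat{p}(\mathbf{x})}{Z}, \\
q_{\phi}(\mathbf{x}) &= n(\mathbf{y})\,\left|\det\frac{\partial f_{\phi}}{\partial \mathbf{y}}\right|^{-1}, \qquad \mathbf{x}=f_{\phi}(\mathbf{y}), \\
\zeta(\mathbf{x},\phi) &= \frac{\hat{p}(\mathbf{x})}{q_{\phi}(\mathbf{x})} = \frac{\hat{p}(f_{\phi}(\mathbf{y}))}{n(\mathbf{y})}\,\left|\det\frac{\partial f_{\phi}}{\partial \mathbf{y}}\right| \;\approx\; Z \quad \text{when } q_{\phi}(\mathbf{x})\approx p(\mathbf{x}).
\end{aligned}
$$

When the flow matches the normalized posterior well, the ratio $\zeta$ is therefore nearly constant across samples, and its scatter gives a rough handle on the numerical uncertainty of the estimate.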

Abstract

We propose a novel method (floZ), based on normalizing flows, to estimate the Bayesian evidence (and its numerical uncertainty) from a pre-existing set of samples drawn from the unnormalized posterior distribution. We validate it on distributions whose evidence is known analytically, up to 15 parameter space dimensions, and compare with two state-of-the-art techniques for estimating the evidence: nested sampling (which computes the evidence as its main target) and a $k$-nearest-neighbors technique that produces evidence estimates from posterior samples. Provided representative samples from the target posterior are available, our method is more robust to posterior distributions with sharp features, especially in higher dimensions. For a simple multivariate Gaussian, we demonstrate its accuracy for up to 200 dimensions with $10^5$ posterior samples. floZ has wide applicability, e.g., to estimate evidence from variational inference, Markov Chain Monte Carlo samples, or any other method that delivers samples and their likelihood from the unnormalized posterior density. As a physical application, we use floZ to compute the Bayes factor for the presence of the first overtone in the ringdown signal of the gravitational wave data of GW150914, finding good agreement with nested sampling.
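To make the idea of recovering the evidence from samples and their unnormalized density concrete, here is a minimal Python sketch. It is not the floZ implementation: the normalizing flow is replaced by a Gaussian fitted to the samples (exact only because the toy target is itself Gaussian), the additional loss terms and boundary handling are omitted, and all names (`log_p_hat`, `log_gauss`, `true_log_Z`) and numerical values are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: an unnormalized Gaussian posterior whose
# log-evidence is known by construction.
true_log_Z = 1.5
mean = np.array([0.5, -1.0])
cov = np.array([[1.0, 0.3], [0.3, 0.5]])


def log_gauss(x, mu, sigma):
    """Log-density of a multivariate normal, evaluated row-wise on x."""
    diff = x - mu
    solved = np.linalg.solve(sigma, diff.T).T
    _, logdet = np.linalg.slogdet(sigma)
    quad = np.sum(diff * solved, axis=1)
    return -0.5 * (quad + logdet + x.shape[1] * np.log(2.0 * np.pi))


def log_p_hat(x):
    """Unnormalized log-posterior: normalized density times Z."""
    return true_log_Z + log_gauss(x, mean, cov)


# Pre-existing posterior samples (here drawn directly; in practice they
# would come from MCMC, variational inference, or a similar sampler).
samples = rng.multivariate_normal(mean, cov, size=10_000)

# Stand-in for the trained flow q_phi: a Gaussian fitted to the samples.
# ASSUMPTION: this replaces the masked autoregressive flow of the paper.
mu_q = samples.mean(axis=0)
cov_q = np.cov(samples, rowvar=False)

# Per-sample log ratio log(p_hat / q); each entry is an estimate of log Z.
log_zeta = log_p_hat(samples) - log_gauss(samples, mu_q, cov_q)

# Summarize the ratios (mean and scatter are an illustrative choice).
log_Z_est = np.mean(log_zeta)
log_Z_spread = np.std(log_zeta)

print(f"estimated log Z = {log_Z_est:.4f} (scatter {log_Z_spread:.4f})")
print(f"true log Z      = {true_log_Z:.4f}")
```

Because the fitted density closely matches the normalized target in this toy case, the per-sample ratios are nearly constant; for posteriors with sharp or non-Gaussian features, a flexible density estimator such as a flow is needed for the ratio to stay flat.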

Paper Structure (13 sections, 16 equations, 8 figures, 1 table)


Figures (8)

  • Figure 1: Effect of high target density near prior boundaries on the normalizing flow prediction. The color plot compares the original, "sharp" boundary distribution (left) with the reflected distribution (right). The color bar represents the unnormalized posterior probability density. The grey lines highlight the boundary. The black (left) and green (right) scatter points in the background show the distribution predicted by the flow trained on the respective distribution.
  • Figure 2: Effect of sharp boundaries on the evidence estimation. The black (green) probability density shows the distribution of the log evidence prediction from the original (reflected) samples. The log evidence is re-scaled by the ground truth, so the true value is at 0 (dashed black line).
  • Figure 3: For each unnormalized posterior, we display $10^4$ posterior samples for the case $d=2$. For ease of comparison, the unnormalized posterior $\hat{p}(\mathbf{x})$ is scaled by its maximum and shown in the common color bar. The shaded grey region represents the boundary of the rectangular prior.
  • Figure 4: Evidence estimation in $d=2$ dimensions for (clockwise from top-left): multivariate Gaussian; finite multivariate Gaussian mixture; Rosenbrock; Exponential. In all cases, floZ and $k$NN employ $10^4$ posterior samples. The true value, represented by the dashed line, has been rescaled to 0, and the shaded regions represent the 1-$\sigma$ uncertainty.
  • Figure 5: Evolution of the network's loss as a function of training epoch for a 2-dimensional mixture of five Gaussians, illustrating the loss schedule. The four loss terms are shown in color and the total loss in thick black; training (validation) losses are shown as solid (dotted) lines.
  • ...and 3 more figures