Accelerated nested sampling with posterior repartitioning and $\beta$-flows for gravitational waves

Metha Prathaban, Harry Bevins, Will Handley

TL;DR

This paper tackles the computational bottleneck of nested sampling (NS) in gravitational-wave inference by exploiting NS's separation of prior and likelihood contributions and learning a repartitioned prior with $\beta$-flows, normalizing flows conditioned on an inverse temperature $\beta$. A low-resolution NS run provides the training data for a $\beta$-flow that models $P(\theta|\beta)$; the flow then serves as the repartitioned prior in a subsequent high-resolution NS run, with $\beta$ treated as a tunable hyperparameter that adapts the proposal. The authors show that this approach preserves the Bayesian evidence while reducing the number of likelihood evaluations by up to an order of magnitude on simulated and real binary black hole mergers, and that it yields robust posteriors and evidences in cases where standard normalizing flows (NFs) can fail. They also discuss practical considerations, such as the higher cost of evaluating $\beta$-flows and their robustness advantages over traditional NFs, and outline future work to broaden applicability and further optimize performance.

Abstract

There is an ever-growing need in the gravitational wave community for fast and reliable inference methods, accompanied by an informative error bar. Nested sampling satisfies the last two requirements, but its computational cost can become prohibitive when using the most accurate waveform models. In this paper, we demonstrate the acceleration of nested sampling using a technique called posterior repartitioning. This method leverages nested sampling's unique ability to separate prior and likelihood contributions at the algorithmic level. Specifically, we define a 'repartitioned prior' informed by the posterior from a low-resolution run. To construct this repartitioned prior, we use a $\beta$-flow, a novel type of conditional normalizing flow designed to better learn deep tail probabilities. $\beta$-flows are trained on the entire nested sampling run and conditioned on an inverse temperature $\beta$. Applying our methods to simulated and real binary black hole mergers, we demonstrate how they can reduce the number of likelihood evaluations required for a given evidence precision by up to an order of magnitude, enabling faster model comparison and parameter estimation. Furthermore, we highlight the robustness of using $\beta$-flows over standard normalizing flows for posterior repartitioning. Notably, $\beta$-flows are able to recover posteriors and evidences which are generally consistent with those from traditional nested sampling, even in cases where standard normalizing flows fail.
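The evidence-preserving property behind posterior repartitioning can be illustrated with a toy one-dimensional sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the uniform prior on $[-10, 10]$, the Gaussian likelihood, the hand-picked Gaussian standing in for a trained flow, and all variable names. The repartitioned likelihood is defined so that the product of likelihood and prior is unchanged, which leaves the evidence and posterior untouched while shrinking the prior-to-posterior KL divergence that nested sampling has to compress through.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Toy 1D problem (illustrative assumption): uniform prior, narrow Gaussian likelihood.
theta = np.linspace(-10.0, 10.0, 200_001)
log_prior = np.full_like(theta, -np.log(20.0))
log_like = -0.5 * ((theta - 1.0) / 0.1) ** 2

# Hypothetical repartitioned prior: a Gaussian three times wider than the
# posterior, standing in for a density learned from a low-resolution run.
log_pr_prior = -0.5 * ((theta - 1.0) / 0.3) ** 2
log_pr_prior -= np.log(trapezoid(np.exp(log_pr_prior), theta))  # normalize on the grid

# Repartitioned likelihood: chosen so that L~ * pi~ = L * pi at every theta.
log_pr_like = log_like + log_prior - log_pr_prior

# The evidence is the integral of likelihood times prior in both partitions.
Z = trapezoid(np.exp(log_like + log_prior), theta)
Z_pr = trapezoid(np.exp(log_pr_like + log_pr_prior), theta)

# The posterior is identical, but its KL divergence from each prior differs.
log_post = log_like + log_prior - np.log(Z)
kl = trapezoid(np.exp(log_post) * (log_post - log_prior), theta)
kl_pr = trapezoid(np.exp(log_post) * (log_post - log_pr_prior), theta)

print(f"Z = {Z:.6e}, Z (repartitioned) = {Z_pr:.6e}")
print(f"KL(posterior || prior) = {kl:.3f}, repartitioned = {kl_pr:.3f}")
```

In this sketch the two evidences agree to floating-point precision, while the KL divergence from the repartitioned prior is several times smaller than from the original prior. A wider stand-in prior was chosen deliberately: a repartitioned prior that is narrower than the true posterior would cut off its tails.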

Paper Structure

This paper contains 15 sections, 22 equations, 18 figures, 3 tables.

Figures (18)

  • Figure 1: Schematic of a nested sampling run. Each dead point defines an iso-likelihood contour in the parameter space (left), which then encloses a certain fractional prior volume (right). As the points compress towards the peak of the likelihood, they enclose smaller and smaller fractional volumes.
  • Figure 2: We evaluate the performance of normalizing flows on a mixture model, composed of five Gaussians combined with unequal weights, as the number of dimensions increases. We generate samples from the mixture model in the full 14 dimensions using the package lsbi [lsbi_paper, lsbi_github] and drop the required number of columns to get samples in lower dimensions. We then train a normalizing flow using margarine on each set of samples, and compare the true log probability with the log probability predicted by the NF (blue). The black dashed line shows where the points would sit if the two perfectly matched. We also fit a five-component Gaussian mixture model to each set of samples using lsbi and plot its log probability predictions too (orange). Since this model is in theory capable of fitting the distribution exactly, it could be taken to represent an upper bound on how well the task of density estimation can be performed in practice on this example. In lower dimensions, the NF performs well, albeit with slightly more scatter compared to the lsbi result. By $n=10$, however, the NF exhibits a significant decline in performance compared to the lsbi fit, with the most severe deterioration in the tails of the distribution. By $n=14$, both fits perform poorly. The arrows indicate points that lie outside the plot area. The full code to reproduce this plot, including details of how the mixture model was generated, can be found at [zenodo].
  • Figure 3: Nested sampling can emulate any temperature. The posterior has an inverse temperature of $\beta=1$ and the prior has an inverse temperature of $\beta=0$. In-between temperatures represent intermediate distributions. This is illustrated first on a more straightforward case where the posterior is a Gaussian and the prior is uniform (top panel). As $\beta$ decreases from $1$ to $0$, the distribution widens. The bottom panel shows the two-dimensional $1\sigma$ contours recovered from a simulated binary black hole merger for the luminosity distance, $d_L$, and the zenith angle between the total angular momentum and the line of sight, $\theta_\mathrm{JN}$. The posterior samples are re-weighted according to equation \ref{beta_weights} to generate the distributions at various temperatures. Between $\beta=0.1$ and $\beta=0.2$, the distribution begins to split into two modes; in the statistical mechanics analogy, this is akin to a phase transition at the critical temperature.
  • Figure 4: The posteriors obtained on some intrinsic parameters (chirp mass $\mathcal{M}$, mass ratio $q$ and effective spin parameter $\chi_\mathrm{eff}$) from standard NS are compared to those obtained using PR with normalizing flows or $\beta$-flows. The results are consistent, showing that both PR methods recover the same answers as standard NS.
  • Figure 5: Similarly to Figure 4, the posteriors on the extrinsic parameters, the luminosity distance and inclination, from the two methods are compared. Again, the results are comparable, with the PR NS methods able to achieve this with far fewer likelihood evaluations. The $\beta$-flow method gives less posterior weight in the second mode and more posterior weight in the first mode than the normal NS run, but this could occur from two separate normal NS runs too, due to the stochasticity of NS [AdamPolychord1]. This stochasticity is quantified by the $\log\mathcal{Z}$ error bars that PolyChord outputs for individual clusters.
  • ...and 13 more figures
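
The temperature re-weighting described in the Figure 3 caption can be sketched on synthetic data. This is a minimal illustration under stated assumptions, not the paper's code: it assumes the standard nested-sampling weights $w_i(\beta) \propto \mathcal{L}_i^{\beta}\,\Delta X_i$, replaces the stochastic prior-volume compression with its expected value $X_i = e^{-i/n_\mathrm{live}}$, and builds the dead points analytically for a two-dimensional Gaussian likelihood with a prior uniform on a disc. All names and parameter values are hypothetical.

```python
import numpy as np

# Illustrative settings (assumptions, not values from the paper).
n_live, n_iter = 500, 15_000
d, sigma, R = 2, 1.0, 10.0   # dimension, likelihood width, prior disc radius

# Idealised dead points: deterministic expected compression of the prior volume.
i = np.arange(1, n_iter + 1)
X = np.exp(-i / n_live)                       # prior volume enclosed by contour i
dX = -np.diff(np.concatenate(([1.0], X)))     # shell volume between contours i-1 and i
r2 = R**2 * X ** (2.0 / d)                    # squared radius of contour i
logL = -0.5 * r2 / sigma**2                   # Gaussian log-likelihood on that contour

def beta_weights(beta):
    """Normalised weights of the dead points at inverse temperature beta."""
    logw = beta * logL + np.log(dX)
    logw -= logw.max()                        # stabilise the exponential
    w = np.exp(logw)
    return w / w.sum()

def mean_r2(beta):
    """Mean squared radius of the distribution at inverse temperature beta."""
    return float(np.sum(beta_weights(beta) * r2))

widths = {beta: mean_r2(beta) for beta in (0.0, 0.1, 0.5, 1.0)}
for beta, value in widths.items():
    print(f"beta = {beta:.1f}: <r^2> = {value:.2f}")
```

Here the $\beta=0$ weights reproduce the uniform-disc prior, whose mean squared radius is $R^2/2$, and the $\beta=1$ weights reproduce the Gaussian posterior, whose mean squared radius is $2\sigma^2$. Intermediate values of $\beta$ fall in between, so the distribution widens as $\beta$ decreases. A single run therefore supplies samples at every temperature.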