Efficient Bayesian analysis of kilonovae and gamma ray burst afterglows with fiesta

Hauke Koehn, Thibeau Wouters, Peter T. H. Pang, Mattia Bulla, Henrik Rose, Hannah Wichern, Tim Dietrich

TL;DR

This work addresses the computational bottleneck of Bayesian inference for kilonova and GRB afterglow emission by introducing fiesta, a JAX-based framework that trains ML surrogates for two GRB afterglow models (afterglowpy and pyblastafterglow) and a kilonova model (possis). The surrogates predict the spectral flux density $F_\nu$ across time and frequency, which makes likelihood evaluations cheap and permits posterior sampling with flowMC on GPUs. The authors validate the approach with injection tests and show substantial speedups (minutes versus hours) while reproducing posteriors consistent with the base models. They apply the method to AT2017gfo/GRB170817A and GRB211211A and, for the first time, perform Bayesian inference with pyblastafterglow. fiesta thus enables rapid, high-dimensional, multi-messenger parameter estimation and sets the stage for joint GW–EM analyses and broader deployment to future transient surveys.

Abstract

Gamma-ray burst (GRB) afterglows and kilonovae (KNe) are electromagnetic transients that can accompany binary neutron star (BNS) mergers. Therefore, studying their emission processes is of general interest for constraining cosmological parameters or the behavior of ultra-dense matter. One common method to analyze electromagnetic data from BNS mergers is to sample a Bayesian posterior over the parameters of a physical model for the transient. However, sampling the posterior is computationally costly, and because of the many likelihood evaluations required in this process, detailed models are too expensive to be used directly in Bayesian inference. In this paper, we address the problem by introducing fiesta, a Python package to train machine learning (ML) surrogates for GRB afterglow and kilonova models that can accelerate likelihood evaluations. Specifically, we introduce extensive ML surrogates for the state-of-the-art GRB afterglow models afterglowpy and pyblastafterglow, along with a new surrogate for KN emission based on the possis code. Our surrogates enable evaluation of the light-curve posterior within minutes. We also provide built-in posterior sampling capabilities in fiesta that rely on the flowMC package and scale efficiently to higher dimensions when adding up to tens of nuisance sampling parameters. Because of its use of the JAX framework, fiesta also allows for GPU acceleration during both surrogate training and posterior sampling. We applied our framework to reanalyze AT2017gfo/GRB170817A and GRB211211A with our surrogates, thus employing the new pyblastafterglow model for the first time in Bayesian inference.
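
To illustrate why a cheap surrogate matters for the likelihood evaluations mentioned above, the following is a minimal sketch and not fiesta's actual API. The function `toy_surrogate`, its two parameters, and the Gaussian likelihood with a systematic uncertainty `sigma_sys` added in quadrature are all assumptions made for illustration; a trained surrogate would be a neural network, and the sketch uses NumPy rather than JAX for brevity.

```python
import numpy as np

def toy_surrogate(params, times):
    """Hypothetical stand-in for a trained surrogate: magnitudes at `times`.

    Assumes a power-law flux decay, flux = 10**log10_f0 * t**(-alpha).
    """
    log10_f0, alpha = params
    # Convert log10(flux) to a magnitude-like quantity: m = -2.5 * log10(flux)
    return -2.5 * (log10_f0 - alpha * np.log10(times))

def log_likelihood(params, sigma_sys, times, mag_obs, mag_err):
    """Gaussian log-likelihood; sigma_sys is added in quadrature (assumed form)."""
    mag_model = toy_surrogate(params, times)
    var = mag_err**2 + sigma_sys**2
    resid = mag_obs - mag_model
    return -0.5 * np.sum(resid**2 / var + np.log(2.0 * np.pi * var))

# Noise-free mock data generated from known (hypothetical) parameters
times = np.array([1.0, 2.0, 5.0, 10.0])
true_params = (-3.0, 1.2)
mag_err = np.full(times.shape, 0.1)
mag_obs = toy_surrogate(true_params, times)

ll_true = log_likelihood(true_params, 0.05, times, mag_obs, mag_err)
ll_off = log_likelihood((-3.0, 1.5), 0.05, times, mag_obs, mag_err)
```

A sampler calls a function like `log_likelihood` many times, so replacing an expensive physical model inside it with a fast surrogate reduces the cost of every one of those calls. In this toy setup the injected parameters yield a higher log-likelihood than the offset ones.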

Paper Structure

This paper contains 15 sections, 12 equations, 15 figures, 1 table.

Figures (15)

  • Figure 1: Benchmarks of the two surrogates for the afterglowpy Gaussian jet model. We show the error distributions of the surrogate predictions against a test data set of size $n_{\text{test}}=7500$. The different rows show the error across different passbands. The left panels show the distribution of the mean squared error as defined in Eq. \ref{eq:mse_error}. The right panels show the mismatch distribution across the test data set as defined in Eq. \ref{eq:mis_error}. The figure compares two different surrogates: one using the MLP architecture (blue) and the other a cVAE (green).
  • Figure 2: Benchmarks of the two surrogates for the pyblastafterglow Gaussian jet model. We show the deviations of surrogate predictions against a test data set of size $n_{\text{test}}=7232$. Figure layout is the same as in Fig. 1.
  • Figure 3: Benchmarks of two surrogates for the KN possis model. We show the deviations of surrogate predictions against a test data set of size $n_{\text{test}}=2238$. Figure layout as in Fig. 1. The figure compares two different surrogates: one using the MLP architecture (blue) and the other a LightcurveModel, where an MLP is trained for each passband separately (green).
  • Figure 4: Parameter recovery for an injected mock light curve from the Gaussian afterglowpy jet model. The corner plot shows the posterior contours at 68% and 95% credibility. Parameters correspond to the symbols in Table 1; $\sigma_{\text{sys}}$ is the freely sampled systematic uncertainty. Different colors compare posteriors obtained with different sampling methods. The posterior in red is based on likelihood evaluations from the proper afterglowpy model with the nmma sampler. The purple posterior relies on the fiesta surrogate for the likelihood evaluation but uses the nmma sampler. The light blue posterior uses the fiesta surrogate as well but is sampled within fiesta's own inference framework that relies on flowMC. The injection parameters used to generate the mock light curve data are indicated by the orange lines. The insets on the upper right side show the injection data across the photometric filters, the best-fit light curve (i.e., highest likelihood) of the fiesta posterior (light blue), and the actual afterglowpy light curve used to generate the mock data (red). The latter lies almost completely underneath the former.
  • Figure 5: P-P plots for GRB afterglow injections. Each panel shows a P-P plot for the recovery of the parameter displayed in its top left corner. The P-P plots show the cumulative distribution of the injected values' posterior quantiles for 200 injections. The light blue curves indicate injection recoveries with afterglowpy, the magenta ones with pyblastafterglow. Solid lines signify that the injections stem from the physical base model; dashed lines indicate an injection with the surrogate itself. The gray areas mark the 68%, 95%, and 99.7% confidence ranges in which the quantile distribution should fall if it were uniformly distributed.
  • ...and 10 more figures
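
The P-P construction described in the Figure 5 caption can be sketched on a toy problem. This is an illustrative assumption, not the paper's pipeline: a one-dimensional parameter with a standard normal prior, a single noisy observation, and an analytic Gaussian posterior stand in for the light-curve inference, while the 200 injections match the caption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_injections, n_samples = 200, 2000
sigma = 1.0  # assumed noise level of the toy measurement

quantiles = np.empty(n_injections)
for i in range(n_injections):
    theta_true = rng.normal(0.0, 1.0)            # injection drawn from the prior N(0, 1)
    datum = theta_true + rng.normal(0.0, sigma)  # one noisy observation
    # Conjugate Gaussian posterior for a N(0, 1) prior and noise sigma
    post_var = 1.0 / (1.0 + 1.0 / sigma**2)
    post_mean = post_var * datum / sigma**2
    samples = rng.normal(post_mean, np.sqrt(post_var), n_samples)
    # Posterior quantile at which the injected value sits
    quantiles[i] = np.mean(samples < theta_true)

# P-P curve: fraction of injections whose quantile lies below each level
levels = np.linspace(0.0, 1.0, 101)
pp_curve = np.array([np.mean(quantiles <= p) for p in levels])

# Largest deviation from the diagonal expected for unbiased recovery
max_dev = np.max(np.abs(pp_curve - levels))
```

If the posteriors are statistically consistent with the injections, the quantiles are uniformly distributed and `pp_curve` follows the diagonal up to sampling scatter, which is what the gray confidence bands in a P-P plot quantify.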