Sampling the full hierarchical population posterior distribution in gravitational-wave astronomy
Michele Mancarella, Davide Gerosa
TL;DR
The paper addresses the challenge of GW population inference by performing full hierarchical sampling of the population hyperparameters and all event parameters, yielding a high-dimensional posterior with about $500$ dimensions for the GWTC-3 catalog. It employs probabilistic programming (e.g., PyMC) and Hamiltonian Monte Carlo, using a continuous Gaussian Mixture interpolation of transformed single-event posteriors to enable joint sampling of $oldsymbol{ ho}$ and $ heta_i$ while accounting for selection via the function $\xi(oldsymbol{ ho})$. The authors demonstrate that this approach recovers standard population constraints, provides population-informed event posteriors, and reveals direct event-by-event contributions to population features through correlation metrics $ ho_{ijk}$ and $oldsymbol{oldsymbol{}}_{ik}$, applied to 69 BH events. This method improves accuracy by removing certain Monte Carlo integrations, offers richer physical insights, and is scalable to larger catalogs and future cosmological extensions, with open-source tooling and potential for emulator-based selection functions and dark-siren cosmology.
Abstract
We present a full sampling of the hierarchical population posterior distribution of merging black holes using current gravitational-wave data. We directly tackle the most relevant intrinsic parameter space made of the binary parameters (masses, spin magnitudes, spin directions, redshift) of all the events entering the GWTC-3 LIGO/Virgo/KAGRA catalog, as well as the hyperparameters of the underlying population of sources. This results in a parameter space of about 500 dimensions, in contrast with current investigations where the targeted dimensionality is drastically reduced by marginalizing over all single-event parameters. In particular, we have direct access to (i) population parameters, (ii) population-informed single-event parameters, and (iii) correlations between these two sets of parameters. We quantify the fractional contribution of each event to the constraints on the population hyperparameters. Our implementation relies on modern probabilistic programming languages and Hamiltonian Monte Carlo, with a continuous interpolation of single-event posterior probabilities. Sampling the full hierarchical problem is feasible, as demonstrated here, and advantageous as it removes some (but not all) of the Monte Carlo integrations that enter the likelihood together with the related variances.
