Simulation-based population inference of LISA's Galactic binaries: Bypassing the global fit

Rahul Srinivasan, Enrico Barausse, Natalia Korsakova, Roberto Trotta

TL;DR

This work addresses the challenge of inferring the population properties of Galactic double white dwarfs (DWDs) from LISA data without a computationally intensive global fit or source-by-source parameter estimation. It introduces an amortized simulation-based inference (SBI) pipeline that combines fast forward simulations of the DWD population, a bespoke compression of the data to ~4096 features, and a conditional normalizing flow that recovers the posterior over population parameters from the full LISA data stream (A and E channels), including both resolved and unresolved sources. The posterior is calibrated with temperature scaling so that its empirical coverage approximately matches the nominal coverage. A key finding is that resolved sources carry most of the information on the population parameters, while the unresolved component mainly helps characterize the noise. The approach is fast, scalable, and adaptable to additional source classes and non-Gaussian noise, which could enable population inference during the LISA mission as a complement to traditional global-fit analyses.

Abstract

The Laser Interferometer Space Antenna (LISA) is expected to detect thousands of individually resolved gravitational wave sources, overlapping in time and frequency, on top of unresolved astrophysical and/or primordial backgrounds. Disentangling resolved sources from backgrounds and extracting their parameters in a computationally intensive "global fit" is normally regarded as a necessary step toward reconstructing the properties of the underlying astrophysical populations. Here, we show that it is in principle feasible to infer the population properties of the most numerous of LISA sources -- Galactic double white dwarfs -- directly from the frequency (or, equivalently, time) strain series by adopting a simulation-based approach, without extracting and estimating the parameters of each single source. By training a normalizing flow on a custom-designed compression of simulated LISA frequency series from the Galactic double white dwarf population, we demonstrate how to infer the posterior distribution of population parameters (e.g., mass function, frequency, and spatial distributions). This allows for extracting information on the population parameters from both resolved and unresolved sources simultaneously and in a computationally efficient manner. This approach can be extended to other source classes (e.g., massive and stellar-mass black holes, extreme mass ratio inspirals) and to scenarios involving non-Gaussian or non-stationary noise (e.g., data gaps), provided that fast and accurate simulations are available.

Paper Structure

This paper contains 16 sections, 9 equations, 12 figures, 3 tables.

Figures (12)

  • Figure 1: Flowchart depicting the SBI training pipeline. The yellow arrows indicate the steps involved in training the deep neural network, and the blue arrows represent the calibration procedure. In the posterior plots, examples of the posterior probability of the injections $p_{\mathrm{SBI}}(\boldsymbol{\Lambda}_{\mathrm{GT}} \vert \boldsymbol{D})$ are shown by the green (red) circles representing cases where the injection lies within (outside) the contours of nominal coverage 0.8. The corresponding empirical coverage is marked in the calibration PP plot. The figure illustrating the simulated population is adapted from LSIA_redbook_2024.
  • Figure 2: Top: probability density of double white dwarfs (DWDs) as a function of GW frequency. Bottom: frequency spectrum of DWD populations with different primary-mass ($m_1$) distributions (inset). The black curve shows the frequency response of the LISA A-channel. Colors correspond to different primary-mass distributions, with all other parameters fixed at their fiducial values: $[m_{0}, m_{\gamma}]$ = [0.65, 0.0485], [1.25, 0.0485], [0.65, 0.31] (orange, blue, and gray, respectively). Note that this is the amplitude spectral density of the channel, not the strain amplitude.
  • Figure 3: Comparison of the log-absolute $\log_{10}\vert\boldsymbol{D}\vert$ (top) and real $\Re(\boldsymbol{D})$ (bottom) values of the $6\times10^{6}$-dimensional data (black) with their summary: a piece-wise linear fit in 1024 frequency bins (red curve) and the $1\sigma$ residual contour (red shade).
  • Figure 4: Comparison of the $4\times1024$-dimensional data summaries for $\log_{10}\vert\boldsymbol{D}\vert$ and $\Re(\boldsymbol{D})$. Each frequency bin is summarized by four features: the slope and intercept of the linear fit, the standard deviation of the residual, and the $L_2$ norm of the residual outliers. Note that each feature has been scaled to a comparable range for stable training.
  • Figure 5: Empirical (i.e., observed) vs nominal coverage for the ten-dimensional posterior before (blue) and after (orange) calibration for the SBI trained on channel A (left) and E (right) data. The black dotted $45^\circ$ line shows the ideal relation. The optimal calibration temperatures $T^*$ for the two channels are 0.62 and 0.72, respectively.
  • ...and 7 more figures
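
The per-bin compression described in the captions of Figures 3 and 4 can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `compress_bins`, the equal-count binning, and the rule that "outliers" are residuals beyond `outlier_sigma` standard deviations are all assumptions made here for illustration, since the captions do not specify them. The feature rescaling mentioned in Figure 4 is also omitted.

```python
import numpy as np


def compress_bins(freqs, values, n_bins=1024, outlier_sigma=3.0):
    """Summarize a 1D series by four features per frequency bin.

    Features per bin: slope and intercept of a linear fit, standard
    deviation of the residual, and L2 norm of the residual outliers.
    Hypothetical helper; binning and outlier rule are assumptions.
    """
    freqs = np.asarray(freqs, dtype=float)
    values = np.asarray(values, dtype=float)
    features = np.empty((n_bins, 4))

    # Assumption: bins contain (nearly) equal numbers of samples.
    bin_indices = np.array_split(np.arange(freqs.size), n_bins)

    for i, idx in enumerate(bin_indices):
        f, v = freqs[idx], values[idx]
        # Degree-1 least-squares fit; polyfit returns [slope, intercept].
        slope, intercept = np.polyfit(f, v, deg=1)
        resid = v - (slope * f + intercept)
        sigma = resid.std()
        # Assumption: outliers are residuals beyond outlier_sigma * sigma.
        outliers = resid[np.abs(resid) > outlier_sigma * sigma]
        features[i] = slope, intercept, sigma, np.linalg.norm(outliers)

    return features


# Toy usage on synthetic noise (not LISA data), with 8 bins for brevity.
rng = np.random.default_rng(0)
freqs = np.linspace(1e-4, 1e-2, 8192)
toy = rng.normal(size=freqs.size)
summary = compress_bins(freqs, toy, n_bins=8)
print(summary.shape)  # (8, 4)
```

Applied with `n_bins=1024` to one quantity such as $\log_{10}\vert\boldsymbol{D}\vert$, this yields the $4\times1024$ layout of Figure 4. Each bin must contain at least two samples for the linear fit to be defined.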