Flexible Gravitational-Wave Parameter Estimation with Transformers

Annalena Kofler, Maximilian Dax, Stephen R. Green, Jonas Wildberger, Nihar Gupte, Jakob H. Macke, Jonathan Gair, Alessandra Buonanno, Bernhard Schölkopf

TL;DR

The paper addresses the need for flexible, scalable gravitational-wave parameter estimation when data conditions vary across detectors and frequency ranges. It introduces Dingo-T1, a transformer-based encoder coupled with a normalizing flow, so that a single amortized model can handle missing data and diverse analysis settings. Tested on 48 events from the LVK O3 run, it raises median sample efficiency from a baseline of 1.4% to 4.2%, offers rapid inference, and supports inspiral-merger-ringdown (IMR) consistency tests, which are practical gains for real-time and catalog-level GW analyses. The work points toward generalized, task-agnostic GW inference and robust handling of incomplete data in current and future observatories.

Abstract

Gravitational-wave data analysis relies on accurate and efficient methods to extract physical information from noisy detector signals, yet the increasing rate and complexity of observations represent a growing challenge. Deep learning provides a powerful alternative to traditional inference, but existing neural models typically lack the flexibility to handle variations in data analysis settings. Such variations accommodate imperfect observations or are required for specialized tests, and could include changes in detector configurations, overall frequency ranges, or localized cuts. We introduce a flexible transformer-based architecture paired with a training strategy that enables adaptation to diverse analysis settings at inference time. Applied to parameter estimation, we demonstrate that a single flexible model -- called Dingo-T1 -- can (i) analyze 48 gravitational-wave events from the third LIGO-Virgo-KAGRA Observing Run under a wide range of analysis configurations, (ii) enable systematic studies of how detector and frequency configurations impact inferred posteriors, and (iii) perform inspiral-merger-ringdown consistency tests probing general relativity. Dingo-T1 also improves median sample efficiency on real events from a baseline of 1.4% to 4.2%. Our approach thus demonstrates flexible and scalable inference with a principled framework for handling missing or incomplete data -- key capabilities for current and next-generation observatories.
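
The sample efficiency quoted in the abstract can be illustrated with a short sketch. It assumes the usual importance-sampling definition, $\epsilon = (\sum_i w_i)^2 / (N \sum_i w_i^2)$, where $w_i$ are the importance weights of $N$ posterior samples; the abstract does not state the formula, so this definition and the function name `sample_efficiency` are assumptions made for illustration.

```python
import math


def sample_efficiency(log_weights):
    """Return ESS / N for a list of unnormalized importance log-weights.

    Assumed definition: (sum w)^2 / (N * sum w^2), which lies in (0, 1].
    """
    n = len(log_weights)
    if n == 0:
        raise ValueError("need at least one weight")
    # Subtract the maximum before exponentiating for numerical stability;
    # the ratio below is invariant under a common rescaling of the weights.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total) / (n * total_sq)
```

Under this definition, equal weights give an efficiency of 1, while a single dominant weight among $N$ samples gives roughly $1/N$. An efficiency of 4.2% would then mean that about 42 out of every 1000 weighted samples are effectively independent.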

Paper Structure

This paper contains 9 sections, 8 equations, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Overview of Dingo-T1. After multibanding, data and detector PSD segments $(d^{(k)}, S_n^{(k)})$ are mapped by a shared tokenizer into tokens that also encode the frequency range $(f^{(k)}, f^{(k+1)})$ and detector identity $I$. The sequence of tokens is masked, augmented with a learnable summary token, and processed by the transformer encoder. Through self-attention, the summary token aggregates information from all unmasked segments. The final summary token is projected via a linear layer to a 128-dimensional feature vector, which conditions a normalizing flow to model the posterior $p(\theta|d, S_\mathrm{n})$ over source parameters.
  • Figure 2: (a) Sample efficiency distribution of 1000 simulated events per detector configuration for the Dingo-T1 model, shown as violin plots. (b) Sample efficiency distribution of Dingo-T1 and the Dingo baseline across 48 real GW events. Dots denote 3-detector events, while circles denote 2-detector events, which cannot be analyzed with the inflexible Dingo baseline. The dashed line represents the median and the dotted lines the quartiles. The efficiencies for simulated (a) and real (b) data are not directly comparable, since the parameters are drawn from different distributions.
  • Figure 3: (a) Posterior distribution for GW190701_203306, showing Dingo-T1 analyses for three different detector configurations. (b) Inspiral-merger-ringdown consistency tests for seven events where the signal was analyzed with the Dingo-T1 model on the inspiral and postinspiral part of the signal. The main panel shows the 90% credible regions of the 2D posteriors on $(\Delta M_f / \overline{M}_f, \Delta \chi_f / \overline{\chi}_f)$ and the side panels show the marginalized posteriors. The color corresponds to the median redshifted total mass obtained from posterior samples of the full signal.
  • Figure 4: The 48 GW events considered in this study are analyzed in 17 different data analysis settings in the O3 LVK catalogs [ligo_gwtc-21:2024, ligo_gwtc-3:2023]. Each event is based on data from a subset of the three detectors (HLV) and spans frequency ranges that vary with data quality and source properties. The Dingo-T1 model accommodates all of these settings with a single neural network.
  • Figure 5: Comparison of (a) random and (b) data-based masking across 100 samples. Data-based masking jointly removes tokens from the same detector, yielding structured patterns, while random masking produces unstructured, scattered masks.
  • ...and 5 more figures
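
The token layout of Figure 1 and the two masking schemes of Figure 5 can be sketched in a few lines of plain Python. This is a minimal illustration, not the Dingo-T1 implementation. The function names, the detector labels `H1`, `L1`, `V1`, the example band edges, and the optional frequency cuts in `data_based_mask` are all assumptions. The captions only state that each token carries a frequency range and a detector identity, and that data-based masking removes tokens from the same detector jointly.

```python
import random

DETECTORS = ("H1", "L1", "V1")  # assumed labels for the HLV network


def make_tokens(band_edges, detectors=DETECTORS):
    """Build one token descriptor per (detector, frequency segment)."""
    return [
        {"detector": det, "f_min": lo, "f_max": hi}
        for det in detectors
        for lo, hi in zip(band_edges[:-1], band_edges[1:])
    ]


def random_mask(tokens, drop_prob, rng):
    """Mask each token independently, giving scattered patterns."""
    return [rng.random() < drop_prob for _ in tokens]


def data_based_mask(tokens, dropped_detectors, f_min=None, f_max=None):
    """Mask whole detectors, plus segments outside [f_min, f_max].

    The frequency cuts are an assumption added for illustration.
    """
    mask = []
    for tok in tokens:
        hide = tok["detector"] in dropped_detectors
        if f_min is not None and tok["f_max"] <= f_min:
            hide = True
        if f_max is not None and tok["f_min"] >= f_max:
            hide = True
        mask.append(hide)
    return mask


tokens = make_tokens([20, 40, 80, 160])          # 3 segments x 3 detectors
scattered = random_mask(tokens, 0.3, random.Random(0))
structured = data_based_mask(tokens, {"V1"}, f_min=40)
```

Here `True` marks a masked token. In `structured`, all three `V1` tokens are hidden together, along with the 20 to 40 Hz segment of every detector, which mirrors the block-like patterns described for data-based masking. The entries of `scattered` are instead drawn independently per token.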