Robust and scalable simulation-based inference for gravitational wave signals with gaps

Ruiting Mao, Jeong Eun Lee, Matthew C. Edwards

TL;DR

This work tackles the challenge of parameter inference from gapped LISA time-series by introducing a scalable simulation-based inference pipeline that uses Flow Matching Posterior Estimation (FMPE). It combines a dual-pathway summarizer (time-domain 1D-CNN and wavelet-based 2D-CNN with asymmetric, dilated kernels) trained end-to-end with the FMPE network to produce calibrated posteriors from highly artifact-laden data. The results show that end-to-end training yields tighter, unbiased posteriors compared to two-stage approaches and that FMPE offers superior stability and coverage calibration over conventional normalizing flows in the presence of data gaps. The approach demonstrates scalability to long-duration signals (e.g., 90 days) and provides a practical framework for robust global-fit inference of Galactic binaries and other LISA sources in realistic, noisy, and incomplete data.
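To make the notion of a gapped time series concrete, here is a minimal sketch of a forward simulator: a monochromatic sinusoid plus white Gaussian noise, with gaps applied as a binary mask. Everything in it is an illustrative assumption rather than the authors' simulator: the function name `simulate_gapped_signal`, the parameter values, the white-noise model, and in particular the choice to zero-fill the gaps.

```python
import math
import random


def simulate_gapped_signal(n_samples=1000, dt=1.0, freq=0.01, amp=1.0,
                           noise_std=0.5, gaps=((200, 260), (700, 720)), seed=0):
    """Toy forward model: sinusoid + white noise, with zero-filled gaps.

    All defaults are hypothetical; `gaps` holds (start, stop) sample indices,
    with `stop` exclusive.
    """
    rng = random.Random(seed)

    # Binary mask: 1 where data are observed, 0 inside a gap.
    mask = [1] * n_samples
    for start, stop in gaps:
        for i in range(start, min(stop, n_samples)):
            mask[i] = 0

    data = []
    for i in range(n_samples):
        clean = amp * math.sin(2.0 * math.pi * freq * i * dt)
        noisy = clean + rng.gauss(0.0, noise_std)
        data.append(mask[i] * noisy)  # zero-fill inside gaps (an assumption)
    return data, mask


data, mask = simulate_gapped_signal()
```

In a simulation-based setting, many such realizations with parameters drawn from the prior would form the training set, so the network sees gap artifacts during training instead of needing an explicit likelihood for them.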

Abstract

The Laser Interferometer Space Antenna (LISA) data stream will inevitably contain gaps due to maintenance and environmental disturbances, introducing nonstationarities and spectral leakage that compromise standard frequency-domain likelihood evaluations. We present a scalable Simulation-Based Inference (SBI) framework capable of robust parameter estimation directly from gapped time-series data. We employ Flow Matching Posterior Estimation (FMPE) conditioned on a learned summary of the data, optimized through an end-to-end training strategy. To address the computational challenges of long-duration signals, we propose a dual-pathway summarizer architecture: a 1D Convolutional Neural Network (CNN) operating on the time domain for high precision, and a novel wavelet-based 2D CNN utilizing asymmetric, dilated kernels to achieve scalability for datasets spanning months. We demonstrate the efficacy of this framework on simulated Galactic Binary-like signals, showing that our joint training approach yields tighter, unbiased posteriors compared to two-stage reconstruction pipelines. Furthermore, we provide the first systematic comparison showing that FMPE offers superior stability and coverage calibration over conventional Normalizing Flows in the presence of severe data artifacts.
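The two ingredients of flow matching posterior estimation named above, a regression loss on a vector field and an ODE integration at inference time, can be sketched on a one-dimensional toy problem. The sketch assumes a linear interpolation path $\theta_t = (1-t)\theta_0 + t\theta_1$ with target velocity $\theta_1 - \theta_0$, which may differ from the path used in the paper. It also replaces the trained network with a closed-form vector field for a toy Gaussian posterior $\mathcal{N}(s, 0.5^2)$, where the scalar summary $s$ plays the role of $s(\hat{d})$. All names and values are hypothetical.

```python
import math
import random

SIGMA_POST = 0.5  # toy posterior width (an assumption, not from the paper)


def toy_vector_field(theta, t, summary):
    """Closed-form marginal velocity carrying N(0, 1) to N(summary, SIGMA_POST^2)
    along the linear path; stands in for a trained network."""
    var_t = (1.0 - t) ** 2 + (t * SIGMA_POST) ** 2
    slope = (t * SIGMA_POST ** 2 - (1.0 - t)) / var_t
    return summary + slope * (theta - t * summary)


def cfm_loss(vector_field, summary, n_samples=2000, seed=0):
    """Monte Carlo estimate of the conditional flow matching loss."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        theta0 = rng.gauss(0.0, 1.0)             # draw from the base distribution
        theta1 = rng.gauss(summary, SIGMA_POST)  # draw from the toy posterior
        t = rng.random()                         # t ~ Uniform(0, 1)
        theta_t = (1.0 - t) * theta0 + t * theta1
        target = theta1 - theta0                 # velocity of the linear path
        total += (vector_field(theta_t, t, summary) - target) ** 2
    return total / n_samples


def sample_posterior(vector_field, summary, theta0, n_steps=200):
    """Forward-Euler integration of d(theta)/dt = v(theta, t, summary) on [0, 1]."""
    theta, h = theta0, 1.0 / n_steps
    for k in range(n_steps):
        theta += h * vector_field(theta, k * h, summary)
    return theta
```

For this toy field the exact flow maps a base draw $\theta_0$ to $s + 0.5\,\theta_0$, so `sample_posterior(toy_vector_field, 2.0, 1.0)` should land close to 2.5, up to Euler discretization error. In practice the vector field would be a neural network trained by minimizing this loss, and a higher-order ODE solver would typically replace the Euler loop.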

Paper Structure

This paper contains 29 sections, 20 equations, 13 figures, 1 table.

Figures (13)

  • Figure 1: Schematic overview of the proposed end-to-end simulation-based inference framework. Top Left: The forward process generates synthetic signals with injected noise and gaps. Blue Box: The Summarizer Network processes these signals via a CNN and Multi-Layer Perceptron (MLP) to produce a low-dimensional summary statistic $s(\hat{d})$. Green Box: The Flow Matching network, conditioned on $s(\hat{d})$, is built from residual blocks and trained to minimize the flow matching loss $\mathcal{L}_{CFM}$. Orange lines and bottom orange Box: During inference on an observed $\hat{d}_o$, the posterior is approximated by integrating the learned vector field from a base distribution $p(\theta_0)$ to the target posterior $p(\theta_1|s(\hat{d}_o))$ using an ODE solver.
  • Figure 2: An illustration of the summarizer structure in both pathways. The green portion on the right represents Pathway 1 for time-domain data, and the orange portion on the left represents Pathway 2 for time-frequency-domain data.
  • Figure 3: An illustration of the asymmetric and dilated convolutional kernel used in Pathway 2. The kernel's wider temporal dimension (width) compared to its frequency dimension (height) is designed to capture long-range correlations in the time-frequency spectrogram.
  • Figure 4: One realization of a 30-day test signal with gaps, shown in the time domain.
  • Figure 5: Comparison of the approximate posterior distributions inferred by flow matching (FM) (orange) and masked autoregressive flow (MAF) (green) with the Pathway 1 summarizer. The red lines indicate the true values.
  • ...and 8 more figures
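
The asymmetric, dilated kernel described in the Figure 3 caption can be illustrated with a small pure-Python sketch. The kernel shape (3 frequency rows by 7 time columns), the time dilation of 4, and the 8 by 40 input are hypothetical values chosen for illustration; the paper's actual hyperparameters are not given here. The point is that dilation widens the span a kernel covers along the time axis without adding weights.

```python
def effective_extent(kernel_size, dilation):
    """Span (in input samples) covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv2d(image, kernel, dilation=(1, 1)):
    """'Valid' 2D cross-correlation with per-axis dilation (frequency, time)."""
    kh, kw = len(kernel), len(kernel[0])
    dh, dw = dilation
    out_h = len(image) - effective_extent(kh, dh) + 1
    out_w = len(image[0]) - effective_extent(kw, dw) + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(
                kernel[a][b] * image[i + a * dh][j + b * dw]
                for a in range(kh)
                for b in range(kw)
            )
    return out


# Hypothetical time-frequency map: 8 frequency rows x 40 time columns,
# where each entry equals its time index.
spectrogram = [[float(col) for col in range(40)] for _ in range(8)]

# Asymmetric averaging kernel: 3 (frequency) x 7 (time), dilated only in time.
kernel = [[1.0 / 21.0] * 7 for _ in range(3)]
features = dilated_conv2d(spectrogram, kernel, dilation=(1, 4))

# The 7 time taps with dilation 4 span 4 * (7 - 1) + 1 = 25 input columns,
# while the 3 frequency taps span only 3 rows.
time_span = effective_extent(7, 4)
freq_span = effective_extent(3, 1)
```

With these values a single layer already covers 25 time samples using only 21 weights, which is the kind of long-range temporal coverage the caption attributes to the Pathway 2 kernel.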