Compact Binary Systems Waveform Generation with Generative Pre-trained Transformer

Ruijun Shi, Yue Zhou, Tianyu Zhao, Zhoujian Cao, Zhixiang Ren

TL;DR

This work tackles the challenge of generating and extrapolating space-based GW waveforms under the complex LISA TDI 2.0 response for MBHB, EMRI, and GB sources. It introduces CBS-GPT, an interpretable transformer with patching and hybrid embeddings trained via next-token prediction to extrapolate subsequent waveform segments, achieving high overlaps across two regimes, e.g., MBHB ~0.98–0.99, GB ~0.99+, and EMRIs around 0.91 (20:1) down to 0.81 (1:1). The study demonstrates strong interpretability through attention maps that correlate with waveform phase and frequency, and shows the model generalizes across parameter variations such as $M_{tot}$ and $\frac{m}{M}$, despite TDI 2.0 complexity. These results indicate that large-transformer approaches can provide fast waveform generation and gap-imputation capabilities, with potential to inform template bank construction and GW data-analysis pipelines in future space-based missions.
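
As a rough illustration of the patching and next-token setup mentioned above, the following sketch splits a waveform into fixed-length patches and pairs each patch with its successor as the prediction target. The function names, the patch length of 16, the non-overlapping layout, and the toy chirp signal are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def make_patches(waveform, patch_len):
    """Split a 1-D waveform into non-overlapping patches (one patch = one token).

    Samples that do not fill a complete patch at the end are dropped.
    """
    waveform = np.asarray(waveform, dtype=float)
    n_patches = len(waveform) // patch_len
    trimmed = waveform[: n_patches * patch_len]
    return trimmed.reshape(n_patches, patch_len)


def next_token_pairs(patches):
    """Pair every patch with the patch that follows it (next-token targets)."""
    inputs = patches[:-1]
    targets = patches[1:]
    return inputs, targets


# Toy chirp standing in for a detector-response waveform (hypothetical data).
t = np.linspace(0.0, 1.0, 1024, endpoint=False)
chirp = np.sin(2.0 * np.pi * (5.0 * t + 20.0 * t**2))

patches = make_patches(chirp, patch_len=16)   # patch length is an assumption
inputs, targets = next_token_pairs(patches)
print(patches.shape, inputs.shape, targets.shape)
```

In this layout the target sequence is simply the input sequence shifted by one patch, which is what lets a trained model extrapolate a waveform by repeatedly predicting the next patch and appending it to the input.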

Abstract

Space-based gravitational wave (GW) detection is one of the most anticipated GW detection projects of the next decade and promises to detect abundant compact binary systems. At present, deep learning methods have not been widely explored for GW waveform generation and extrapolation. To address the data processing difficulty and the increasing waveform complexity caused by the detector's response and second-generation time-delay interferometry (TDI 2.0), an interpretable pre-trained large model named CBS-GPT (Compact Binary Systems Waveform Generation with Generative Pre-trained Transformer) is proposed. For compact binary system waveforms, three models were trained to predict the waveforms of massive black hole binaries (MBHB), extreme mass-ratio inspirals (EMRIs), and galactic binaries (GB), achieving prediction accuracies of up to 99%, 91%, and 99%, respectively. The CBS-GPT model exhibits notable generalization and interpretability, with its hidden parameters effectively capturing the intricate information of waveforms, even with the complex instrument response and a wide parameter range. Our research demonstrates the potential of large models in the GW realm, opening up new opportunities and guidance for future research such as complex waveform generation, gap completion, and deep learning model design for GW science.
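
The prediction accuracies quoted above are reported as overlaps between predicted and target waveforms. A minimal sketch of such a metric is the normalized inner product below. It assumes a flat (white) noise spectrum and does no maximization over time or phase shifts, so it may differ from the exact definition used in the paper; the function name and the test signals are hypothetical.

```python
import numpy as np


def overlap(h_pred, h_true):
    """Normalized time-domain inner product of two equal-length waveforms.

    Returns 1.0 for waveforms identical up to a positive amplitude factor.
    Assumes white noise, i.e. no noise-PSD weighting (a simplification).
    """
    h_pred = np.asarray(h_pred, dtype=float)
    h_true = np.asarray(h_true, dtype=float)
    inner = np.dot(h_pred, h_true)
    norm = np.sqrt(np.dot(h_pred, h_pred) * np.dot(h_true, h_true))
    return inner / norm


# Two sinusoids over an integer number of cycles, differing by a 0.2 rad phase.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
target = np.sin(2.0 * np.pi * 10.0 * t)
predicted = np.sin(2.0 * np.pi * 10.0 * t + 0.2)

print(overlap(target, target))      # identical waveforms
print(overlap(predicted, target))   # equals cos(0.2) for this toy case
```

For these two sinusoids the overlap reduces to the cosine of the phase offset, which shows how sensitive the metric is to accumulated phase error: a 0.2 rad offset already lowers the overlap to about 0.98.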

Paper Structure

This paper contains 16 sections, 26 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Overview of CBS-GPT. The CBS-GPT model was trained separately for three kinds of GW sources (MBHB, EMRIs, and GB). The subsequent waveform can be extrapolated after feeding its corresponding preceding waveform into CBS-GPT. Details of the data and model description are in Section \ref{sec:met}.
  • Figure 2: The TDI 2.0 response complicates waveforms. To simplify waveform comparison here, all waveforms were standardized to a maximum amplitude of 1. The $\Delta t$ in the figure represents the sampling interval. The effects of different parameters in the time and frequency domains are shown in the left and right panels, respectively. (a) MBHB waveforms at different $M_{tot}$. At high frequencies, the TDI response function has a greater impact. The gray line represents the TDI 2.0 transfer function in the frequency domain. (b) EMRI waveforms at different $e_0$. As the eccentricity increases, the EMRI waveform becomes increasingly complex in the frequency domain. (c) GB waveforms at different $f$. The GB signal is relatively simple and is a single-frequency signal.
  • Figure 3: The overlap distribution of MBHB is shown in (a). (b, c, d) portray the heat maps of the $M_{tot}$ and $\chi_{\mathrm{eff}}$ parameters, which have the greatest impact on overlap. A darker color corresponds to a higher overlap value.
  • Figure 4: The overlap distributions of EMRIs and GB are shown in (a, d). (b, c) portray the heat maps of the $e_0$ and $M$ parameters, which have the greatest impact on the overlap of EMRIs. Similarly, (e, f) portray the heat maps of the frequency parameter $f$, which has the greatest impact on the overlap of GB. A darker color corresponds to a higher overlap value.
  • Figure 5: CBS-GPT prediction results. (a, b) MBHB results. (c, d) EMRI results. (e, f) GB results. (g) Generalization results of MBHB waveforms with $1/q \approx$ 10, 40, 70, and 100, respectively. We set the predicted starting point at time zero. The blue line represents the conjunction of the last part of the input waveform and the target label, the orange line is the predicted waveform, and the gray line is the difference between the predicted and target waveforms. The inset in each subfigure shows the predicted and target waveforms in the frequency domain, as well as the differences between them.
  • ...and 1 more figures