Compact Binary Systems Waveform Generation with Generative Pre-trained Transformer
Ruijun Shi, Yue Zhou, Tianyu Zhao, Zhoujian Cao, Zhixiang Ren
TL;DR
This work tackles the challenge of generating and extrapolating space-based GW waveforms under the complex LISA TDI 2.0 response for MBHB, EMRI, and GB sources. It introduces CBS-GPT, an interpretable transformer with patching and hybrid embeddings, trained via next-token prediction to extrapolate subsequent waveform segments. The model achieves high overlaps across two regimes, e.g., roughly 0.98–0.99 for MBHBs, 0.99 or higher for GBs, and from about 0.91 (20:1) down to 0.81 (1:1) for EMRIs. The study demonstrates strong interpretability through attention maps that correlate with waveform phase and frequency, and shows that the model generalizes across parameter variations such as $M_{tot}$ and $\frac{m}{M}$, despite the complexity of TDI 2.0. These results indicate that large transformer models can provide fast waveform generation and gap imputation, with the potential to inform template bank construction and GW data-analysis pipelines in future space-based missions.
Abstract
Space-based gravitational wave (GW) detection is among the most anticipated GW detection projects of the next decade and promises to detect abundant compact binary systems. At present, deep learning methods have not been widely explored for GW waveform generation and extrapolation. To address the data processing difficulty and the increasing waveform complexity caused by the detector's response and second-generation time-delay interferometry (TDI 2.0), an interpretable pre-trained large model named CBS-GPT (Compact Binary Systems Waveform Generation with Generative Pre-trained Transformer) is proposed. For compact binary system waveforms, three models were trained to predict the waveforms of massive black hole binaries (MBHB), extreme mass-ratio inspirals (EMRIs), and galactic binaries (GB), achieving prediction accuracies of up to 99%, 91%, and 99%, respectively. The CBS-GPT model exhibits notable generalization and interpretability, with its hidden parameters effectively capturing the intricate information of waveforms, even with the complex instrument response and a wide parameter range. Our research demonstrates the potential of large models in the GW realm, opening up new opportunities and guidance for future research such as complex waveform generation, gap completion, and deep learning model design for GW science.
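To make the terms "patching" and "overlap" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes non-overlapping fixed-length patches as tokens and a zero-lag, flat-noise normalized inner product as the overlap; the actual metric may weight by the detector noise spectrum and differ in other details. The function names, the patch length, and the toy chirp signal are all hypothetical.

```python
import numpy as np


def patch_waveform(x, patch_len):
    """Split a 1-D waveform into non-overlapping patches (tokens).

    Any samples left over after the last full patch are dropped.
    """
    n_patches = len(x) // patch_len
    return x[: n_patches * patch_len].reshape(n_patches, patch_len)


def overlap(h, s):
    """Normalized zero-lag inner product of two waveforms.

    Assumption: flat (white) noise, so no noise weighting is applied.
    """
    return float(np.dot(h, s) / np.sqrt(np.dot(h, h) * np.dot(s, s)))


# Toy chirp-like signal standing in for a TDI-response waveform (hypothetical).
t = np.linspace(0.0, 1.0, 4096)
phase = 2.0 * np.pi * (5.0 * t + 20.0 * t**2)
target = np.sin(phase)

# Tokenize: 4096 samples -> 64 patches of 64 samples each.
tokens = patch_waveform(target, patch_len=64)

# Stand-in for a model prediction: the same signal with a small phase error.
predicted = np.sin(phase + 0.1)

# Score only the extrapolated part, here taken to be the last 16 patches.
n_pred = 16
score = overlap(
    patch_waveform(predicted, 64)[-n_pred:].ravel(),
    tokens[-n_pred:].ravel(),
)
```

In this sketch an overlap of 1 means the predicted segment matches the target up to an overall positive scale, and a phase error lowers the score below 1.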
