Dilated convolutional neural network for detecting extreme-mass-ratio inspirals

Tianyu Zhao, Yue Zhou, Ruijun Shi, Zhoujian Cao, Zhixiang Ren

TL;DR

This work targets detection of extreme mass ratio inspirals (EMRIs) in space-based gravitational-wave data by introducing DECODE, an end-to-end, frequency-domain detector based on a dilated causal convolutional network. By training on synthetic TDI-1.5 data and processing year-long multichannel streams, DECODE achieves high detection rates (e.g., true positive rate around $96$–$97\%$ at a false alarm rate of $1\%$) with inference times under $0.01$ s per sample, across accumulated SNRs from $50$ to $240$. The method leverages causal, dilated convolutions to capture long-range dependencies in frequency domain data, and demonstrates interpretability via activation maps and generalization across waveform models (AAK, AK, XSPEG). The study highlights the potential of fast, robust EMRI detection in future space-based GW analyses and outlines paths for improvement, such as incorporating TDI-2.0 and phase information to further boost performance.

Abstract

The detection of Extreme Mass Ratio Inspirals (EMRIs) is intricate due to their complex waveforms, extended duration, and low signal-to-noise ratio (SNR), making them harder to identify than compact binary coalescences. While matched filtering-based techniques are known for their computational demands, existing deep learning-based methods primarily handle time-domain data and are often constrained by data duration and SNR. In addition, most existing work ignores time-delay interferometry (TDI) and applies the long-wavelength approximation in detector response calculations, which limits its ability to handle laser frequency noise. In this study, we introduce DECODE, an end-to-end model focusing on EMRI signal detection by sequence modeling in the frequency domain. Centered around a dilated causal convolutional neural network and trained on synthetic data that accounts for the TDI-1.5 detector response, DECODE can efficiently process a year's worth of multichannel TDI data with an SNR of around 50. We evaluate our model on 1-year data with accumulated SNR ranging from 50 to 120 and achieve a true positive rate of 96.3% at a false positive rate of 1%, with an inference time of less than 0.01 seconds. With the visualization of three showcased EMRI signals for interpretability and generalization, DECODE exhibits strong potential for future space-based gravitational wave data analyses.
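
The headline figure in the abstract is a true positive rate (TPR) quoted at a fixed false positive rate (FPR). As a rough illustration of how such a number can be read off classifier scores, the sketch below picks a threshold from noise-only scores and then measures the fraction of signal scores above it. The function name `tpr_at_fpr` and the toy scores are hypothetical and are not taken from the paper; the authors' evaluation code may differ.

```python
def tpr_at_fpr(noise_scores, signal_scores, target_fpr=0.01):
    """Return (threshold, tpr) for a detector that flags scores > threshold.

    The threshold is chosen so that the fraction of noise-only scores
    strictly above it does not exceed target_fpr.
    Hypothetical helper for illustration only.
    """
    if not noise_scores or not signal_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(noise_scores, reverse=True)
    # Number of noise-only samples we may wrongly flag as detections.
    allowed = int(target_fpr * len(ranked))
    # The (allowed + 1)-th largest noise score becomes the threshold.
    threshold = ranked[min(allowed, len(ranked) - 1)]
    detected = sum(1 for s in signal_scores if s > threshold)
    return threshold, detected / len(signal_scores)


# Toy scores (made up): 100 noise-only samples and 4 signal samples.
noise = [i / 100 for i in range(100)]
signal = [0.5, 0.97, 0.995, 1.0]
threshold, tpr = tpr_at_fpr(noise, signal, target_fpr=0.01)
```

With these toy inputs, exactly one of the 100 noise-only scores lies above the chosen threshold, so the empirical FPR is 1%.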

Paper Structure (20 sections, 8 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Visualization of a training data sample. This depicts an EMRI signal from the TDI-A channel spanning 1 year with an SNR of 70. (a), Time-domain representation of the TDI-A strain, showcasing both the combined data (signal + noise) and the signal. The signal's amplitude is about 3 orders of magnitude lower than the noise, which makes the detection challenging. (b), Welch PSD of the combined data and the signal. The signal contains many modes (peaks), some of which reach the noise level, highlighting the suitability of the frequency-domain detection method. The designed detector noise PSD is also presented for reference.
  • Figure 2: Comprehensive EMRI detection framework. (a), Depicts the entire EMRI detection process, from initial data preprocessing to the end-to-end DECODE model. (b), Highlights the mechanism of dilated causal convolution with dilation factors of $(1, 2, 4, 8)$ and a kernel size of 2, emphasizing the exponential growth of the receptive field. (c), Detailed architecture of the residual block in DECODE, comprising two dilated causal convolutional layers, weight normalization, ReLU, and dropout layers. A $1\times1$ convolution is introduced to address any dimension discrepancies between the residual input and output.
  • Figure 3: EMRI detection performance across SNR and $\bm{N}$. All sub-plots depict ROC curves for distinct input sample lengths $N$ within specific SNR ranges, presented on a logarithmic scale. Each line style shows the trade-off between TPR and FPR for a given sample length, with the area beneath each curve representing the model's efficacy. A reference yellow dashed line indicates random prediction. The use of logarithmic scales enhances the visibility of performance differences, especially at lower FPR levels. (a), Evaluation for $\mathrm{SNR} \in [50, 120]$. (b), Evaluation for $\mathrm{SNR} \in [70, 170]$. (c), Evaluation for $\mathrm{SNR} \in [100, 240]$.
  • Figure 4: Detection capability of DECODE across various parameters. (a), Illustrates the TPR as a function of SNR, highlighting the model's capability to detect signals with varying strengths. (b), Showcases the TPR plotted against the relative amplitude $\mathcal{A}$ (defined in \ref{eq:amp}), emphasizing the model's ability to detect power excesses in the frequency domain and detect signals even when they are submerged within the noise. (c), Explores the TPR in relation to the spin parameter $a$, keeping the MBH mass consistent at $10^6 M_{\odot}$. This sub-figure is evaluated at three distinct SNR levels: 50, 70, and 100, shedding light on the relationship between spin parameters and detection capabilities.
  • Figure 5: Interpretability and generalization ability showcase. This figure provides an in-depth visualization of the intermediate outputs from each residual block, demonstrating the model's capability for feature extraction within the frequency domain and its generalization ability to different waveform templates and gravitational theories. For each sub-figure, panels i and ii represent the intermediate results corresponding to the input data samples shown in panels iii and iv. In contrast to the faint activations in panel ii, the noticeable activated neurons in panel i indicate the extraction of essential characteristics when a signal is present in the input. (a), AAK waveform. (b), AK waveform. (c), XSPEG waveform.
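
The caption of Figure 2(b) describes dilated causal convolutions with dilation factors $(1, 2, 4, 8)$ and a kernel size of 2, and notes the exponential growth of the receptive field. The sketch below illustrates that mechanism in plain Python, assuming one convolution per dilation factor and all-ones weights so that an impulse traces out the receptive field. The function names and the fixed weights are illustrative assumptions, not the authors' implementation. The residual blocks of Figure 2(c) contain two such layers each, so DECODE's actual receptive field would differ from the value computed here.

```python
def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution: y[t] = sum_j weights[j] * x[t - j * dilation].

    Positions before the start of the sequence are treated as zeros
    (left padding), so y[t] never depends on x[t + 1], x[t + 2], ...
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack with one conv layer per dilation factor."""
    return 1 + (kernel_size - 1) * sum(dilations)


DILATIONS = (1, 2, 4, 8)   # dilation factors from the Figure 2(b) caption
KERNEL_SIZE = 2            # kernel size from the Figure 2(b) caption


def stack(x):
    """Apply the four layers in sequence with illustrative all-ones weights."""
    for d in DILATIONS:
        x = causal_dilated_conv(x, [1.0] * KERNEL_SIZE, d)
    return x


# An impulse at index 0 shows which output positions can "see" one input bin.
impulse = [1.0] + [0.0] * 31
response = stack(impulse)
rf = receptive_field(KERNEL_SIZE, DILATIONS)
```

The impulse response is nonzero at exactly `rf` consecutive positions starting from the impulse, so each doubling of the dilation factor doubles the span covered while adding only one layer.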