Towards an anomaly detection pipeline for gravitational waves at the Einstein Telescope

Gianluca Inguglia, Huw Haigh, Kristyna Vitulova, Ulyana Dupletsa

TL;DR

The paper reframes gravitational-wave searches as anomaly-detection problems by training a convolutional autoencoder on noise-only time–frequency spectrograms from a single Einstein Telescope detector. Short, low-frequency bursts, such as IMBH-related mergers, are targeted, with anomalies identified through reconstruction-error thresholds, enabling model-independent detection without relying on waveform templates. Weak supervision is introduced by injecting GW signals and adding a separation loss, which dramatically improves performance: unsupervised training yields about 23% recovery for IMBH-merging signals, while weakly supervised training recovers 100% of IMBH-involving mergers in the MDC dataset, with a false-alarm rate of roughly 4.5 events per year for a single detector at 100% duty cycle. The results demonstrate a promising, scalable, model-independent framework for automated GW searches, paving the way for fully adaptive, multi-detector anomaly-detection pipelines, though challenges such as glitches and source localization/classification remain to be addressed.
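The reconstruction-error thresholding described above can be illustrated with a minimal sketch. It is not the authors' implementation: the autoencoder is replaced by precomputed reconstructions, the spectrograms are tiny toy arrays, and the names `reconstruction_mse`, `flag_anomalies` and the threshold value are hypothetical.

```python
from statistics import mean


def reconstruction_mse(original, reconstructed):
    """Mean squared error between two equally shaped 2-D arrays (lists of rows)."""
    squared_errors = [
        (o - r) ** 2
        for row_o, row_r in zip(original, reconstructed)
        for o, r in zip(row_o, row_r)
    ]
    return mean(squared_errors)


def flag_anomalies(originals, reconstructions, threshold):
    """Return the indices of inputs whose reconstruction MSE exceeds the threshold."""
    return [
        i
        for i, (x, x_hat) in enumerate(zip(originals, reconstructions))
        if reconstruction_mse(x, x_hat) > threshold
    ]


# Toy stand-ins: a model trained on noise reproduces noise well, but it
# reproduces only the noise-like part of an input that contains a burst.
noise = [[0.1, 0.2], [0.2, 0.1]]
noise_reconstruction = [[0.1, 0.2], [0.2, 0.1]]
burst = [[0.1, 0.9], [0.8, 0.1]]
burst_reconstruction = [[0.1, 0.2], [0.2, 0.1]]

# Hypothetical threshold; in practice it would be derived from the
# reconstruction-error distribution of noise-only data.
THRESHOLD = 0.05

flagged = flag_anomalies(
    [noise, burst], [noise_reconstruction, burst_reconstruction], THRESHOLD
)
print(flagged)
```

Only the second input is flagged, because the stand-in reconstruction fails to reproduce the excess power that is absent from the noise-only training data.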

Abstract

We present the implementation of an anomaly-detection algorithm based on a deep convolutional autoencoder for the search for gravitational waves (GWs) in time-frequency spectrograms. Our method targets short-duration ($\lesssim 2\,\text{s}$) GW signals, exemplified by mergers of compact objects forming or involving an intermediate-mass black hole (IMBH). Such short signals are difficult to distinguish from background noise; yet their brevity makes them well-suited to machine-learning analyses with modest computational requirements. Using the data from the Einstein Telescope Mock Data Challenge as a benchmark, we demonstrate that the approach can successfully flag GW-like transients as anomalies in interferometer data of a single detector, achieving an initial detection efficiency of 23% for injected signals corresponding to IMBH-forming mergers. After introducing weak supervision, the model exhibits excellent generalisation and recovers all injected IMBH-forming mergers, independent of their total mass or signal-to-noise ratio, with a false-alarm rate due to statistical noise fluctuations of approximately 4.5 events per year for a single interferometer operating with a 100% duty cycle. The method also successfully identifies lower-mass mergers leading to the formation of black holes with mass larger than $\simeq 20\,M_\odot$. Our pipeline does not yet classify anomalies, that is, it does not distinguish actual GW signals from noise artefacts; however, it highlights any deviation from the learned background noise distribution for further scrutiny. These results demonstrate that anomaly detection offers a powerful, model-independent framework for future GW searches, paving the way toward fully automated and adaptive analysis pipelines.
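As a rough consistency check on the quoted false-alarm rate, the arithmetic below combines the two-second segment length and the 5$\sigma$ threshold mentioned in the figure captions. It is a sketch under assumptions the text does not confirm: non-overlapping segments, continuous operation over a Julian year, and a one-sided Gaussian-equivalent tail probability for the threshold.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, 100% duty cycle
SEGMENT_SECONDS = 2.0                  # assumed non-overlapping segments


def one_sided_gaussian_tail(n_sigma):
    """Probability that a standard normal variable exceeds n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


segments_per_year = SECONDS_PER_YEAR / SEGMENT_SECONDS
tail_probability = one_sided_gaussian_tail(5.0)
false_alarms_per_year = segments_per_year * tail_probability

print(f"segments per year:     {segments_per_year:.0f}")
print(f"5-sigma tail:          {tail_probability:.3e}")
print(f"false alarms per year: {false_alarms_per_year:.1f}")
```

Under these assumptions, about 1.6e7 segments per year and a tail probability near 2.9e-7 give roughly 4.5 noise-induced anomalies per year, in line with the figure stated in the abstract.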

Paper Structure

This paper contains 10 sections, 14 equations, 12 figures.

Figures (12)

  • Figure 1: Theoretical frequency $f_{220}$ as a function of redshift for binary black hole mergers assuming different total source-frame masses [Berti:2005ys], calculated using $\alpha(a_f) \simeq 0.536$ and $\varepsilon = 0.07$ [Keitel:2016krm, Healy:2014eua]. See the appendix (app:qnm_freq) for further details. The red points indicate IMBH candidates detected via GWs, including GW190521 [LIGOScientific:2020iuh, LIGOScientific:2020ufj] and GW231123 [LIGOScientific:2025rsn]. The yellow dashed lines indicate the lower frequency limits of the LIGO-Virgo-KAGRA detector network and of the Einstein Telescope [Abbott:2020lrr, Abac:2025saz].
  • Figure 2: Schematic of the convolutional autoencoder (CAE) architecture adopted for anomaly detection of gravitational waves. The input data — spectrograms of two-second segments of noise strain — are encoded into a latent representation through three convolutional layers, and subsequently decoded via mirrored deconvolutional layers to reconstruct the input. The CAE is trained on noise-only data, enabling it to detect anomalous spectrograms (e.g., containing GW transients) through deviations in reconstruction error.
  • Figure 3: An example of the ET noise-only spectrograms used in the training and testing of the convolutional autoencoder. Note that the y-axis (frequency) is on a $\log_2$ scale to highlight the lower-frequency region, which is of particular interest when searching for IMBH mergers. For illustrative purposes, this spectrogram has not been whitened or normalised.
  • Figure 4: The distribution of the reconstruction MSE for all spectrograms in the ET noise samples. The distribution is shown on a logarithmic scale, with the fitted Gaussian distribution in red. The data begin to deviate from the Gaussian model in the far tail; in this region they were instead fitted with a generalised Pareto distribution in order to extract a $5\sigma$ threshold.
  • Figure 5: Distributions of the reconstruction MSE from the training (blue) and testing (black points) ET noise-only spectrograms. The ratio of the two distributions is shown in the bottom panel; the flat ratio indicates that the model generalises well to unseen input data.
  • ...and 7 more figures
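The redshift dependence of $f_{220}$ in Figure 1 can be sketched numerically. The paper's appendix is not reproduced here, so the formula below is an assumption: it uses the standard relation $f_{220} = \alpha(a_f)\,c^3 / [2\pi G M_f (1+z)]$ and reads $\varepsilon$ as the fraction of the total mass radiated away, so that $M_f = (1-\varepsilon)\,M_\mathrm{tot}$. The function name `f220_observed` is hypothetical.

```python
import math

# G * M_sun / c^3 in seconds (solar mass expressed as a time).
G_MSUN_OVER_C3 = 4.925491e-6


def f220_observed(total_mass_msun, redshift, alpha=0.536, epsilon=0.07):
    """Detector-frame frequency (Hz) of the dominant ringdown mode.

    Assumes M_f = (1 - epsilon) * M_tot and
    f = alpha / (2 * pi * G * M_f / c^3) / (1 + z).
    """
    final_mass_seconds = (1.0 - epsilon) * total_mass_msun * G_MSUN_OVER_C3
    f_source = alpha / (2.0 * math.pi * final_mass_seconds)
    return f_source / (1.0 + redshift)


# Total source-frame masses in solar masses, chosen only for illustration.
for mass in (100.0, 300.0, 1000.0):
    row = ", ".join(f"z={z}: {f220_observed(mass, z):6.1f} Hz" for z in (0.0, 1.0, 3.0))
    print(f"M_tot = {mass:6.0f} M_sun -> {row}")
```

Under these assumptions the frequency scales as $1/[M_\mathrm{tot}(1+z)]$, so heavier or more distant systems ring down at lower detector-frame frequencies, which is why the low-frequency limit of a detector matters for IMBH searches.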