Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches
Yi Li, Yunan Wu, Aggelos K. Katsaggelos
TL;DR
CTSAE is an unsupervised, four-branch CNN–ViT autoencoder that processes spectrograms of LIGO glitches at four time windows and fuses cross-branch information through a shared CLS token and a CLS Fusion Module, yielding a discriminative latent code $\hat{z}$ for clustering. The model is trained with a sum of per-branch reconstruction losses, $L = \sum_{i=1}^{4} L_{mse}(I_i, \hat{I}_i)$, and uses no ground-truth labels during training. On Gravity Spy O3 main-channel data it achieves better clustering performance, measured by NMI and ARI, than semi-supervised baselines. The study shows that the multi-branch architecture, CNN–ViT fusion, and CLS-based cross-branch communication are key to capturing both global and local glitch patterns across timescales, and the reconstruction quality indicates that glitch structure is faithfully preserved. This unsupervised approach offers a scalable pathway for glitch identification in upcoming Gravity Spy 2.0 data across main and auxiliary channels, reducing dependence on manual labeling and supporting robust gravitational-wave detection pipelines.
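To make the training objective concrete, the summed reconstruction loss can be written out per branch. The sketch below assumes that each spectrogram $I_i$ is a single-channel image of height $H$ and width $W$ and that $L_{mse}$ averages over pixels; neither the image shape nor the normalization is stated in this summary, so both are illustrative assumptions.

$$
L = \sum_{i=1}^{4} L_{mse}(I_i, \hat{I}_i), \qquad L_{mse}(I_i, \hat{I}_i) = \frac{1}{HW} \sum_{h=1}^{H} \sum_{w=1}^{W} \left( I_i[h, w] - \hat{I}_i[h, w] \right)^2
$$

Under this reading, each of the four time-window branches contributes equally, so no single timescale dominates the objective.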
Abstract
Advances in the Laser Interferometer Gravitational-Wave Observatory (LIGO) have significantly enhanced the feasibility and reliability of gravitational wave detection. However, LIGO's high sensitivity makes it susceptible to transient noise artifacts known as glitches, which must be distinguished from real gravitational wave signals. Traditional approaches predominantly employ fully supervised or semi-supervised algorithms for glitch classification and clustering. For the future task of identifying and classifying glitches across main and auxiliary channels, it is impractical to build a dataset with manually labeled ground truth. In addition, glitch patterns can vary over time, producing new glitch types that have no manual labels. In response to this challenge, we introduce the Cross-Temporal Spectrogram Autoencoder (CTSAE), a pioneering unsupervised method for the dimensionality reduction and clustering of gravitational wave glitches. CTSAE integrates a novel four-branch autoencoder with a hybrid of Convolutional Neural Networks (CNN) and Vision Transformers (ViT). To further extract features across branches, we introduce a novel multi-branch fusion method using the CLS (Class) token. Our model, trained and evaluated on the main channel of the Gravity Spy O3 dataset, demonstrates superior clustering performance compared to state-of-the-art semi-supervised learning methods. To the best of our knowledge, CTSAE represents the first unsupervised approach tailored specifically for clustering LIGO data, marking a significant step forward in the field of gravitational wave research. The code of this paper is available at https://github.com/Zod-L/CTSAE
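Clustering performance of an unsupervised model is typically scored by comparing predicted cluster assignments against held-out reference labels; the TL;DR names NMI and ARI for this purpose. As an illustration of one of these metrics, here is a minimal pure-Python sketch of the standard Adjusted Rand Index. The function name `adjusted_rand_index` and the toy labels are hypothetical and are not taken from the CTSAE repository.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same samples.

    Illustrative helper only (not from the CTSAE code base).
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no sample pairs to disagree on

    # Contingency counts: samples sharing a (true label, predicted label) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of sample pairs grouped together in both / in each labeling.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = together_true * together_pred / comb(n, 2)
    maximum = 0.5 * (together_true + together_pred)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (together_both - expected) / (maximum - expected)


# Toy data (made up): cluster ids need not match label names, only the grouping.
reference = ["a", "a", "b", "b"]
clusters = [1, 1, 0, 0]
print(adjusted_rand_index(reference, clusters))
```

Because ARI depends only on which samples are grouped together, it needs no mapping between cluster ids and class names, which is why it suits evaluating a model trained without labels.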
