An Analysis of LIGO Glitches Using t-SNE During the First Part of the Fourth LIGO-Virgo-KAGRA Observing Run

Tabata Aira Ferreira, Gabriela González, Osvaldo Salas

TL;DR

This work addresses the challenge of characterizing non-astrophysical glitches in LIGO data by applying an unsupervised pipeline that reduces high-dimensional Omicron glitch features to a two-dimensional representation using t-SNE, followed by Agglomerative Clustering with a Silhouette-based criterion to define distinct glitch groups. By analyzing the temporal evolution of these groups during O4a at LLO and LHO, the study demonstrates that glitch morphology and occurrence correlate with environmental and instrumental factors: at LLO, low-frequency glitches (L1/L3) track microseismic ground motion in the $0.1$–$0.3$ Hz and $0.3$–$1$ Hz bands, while at LHO, broadband glitches in the $20$–$50$ Hz range are linked to the electrostatic drive (ESD) system. The key contributions are the identification and temporal tracking of glitch groups, quantification of their correlations with ground motion or ESD, and demonstration of how detector controls and environmental conditions shape glitch populations. The approach improves understanding of glitch origins and supports data-quality mitigation, with broad applicability to LVK detectors and future observing runs.

Abstract

This paper presents an analysis of noise transients observed in LIGO data during the first part of the fourth observing run, using the unsupervised machine learning technique t-distributed Stochastic Neighbor Embedding (t-SNE) to examine the behavior of glitch groups. Based on the t-SNE output, we apply Agglomerative Clustering in combination with the Silhouette Score to determine the optimal number of groups. We then track these groups over time and investigate correlations between their occurrence and environmental or instrumental conditions. At the Livingston observatory, the most common glitches during O4a were seasonal and associated with ground motion, whereas at Hanford, the most prevalent glitches were related to instrumental conditions.
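
The pipeline described in the abstract (t-SNE embedding, then Agglomerative Clustering with the number of groups chosen by Silhouette Score) can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' code: the function name `embed_and_cluster`, the candidate range of group counts, the perplexity, and the synthetic input data are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score


def embed_and_cluster(features, k_range=range(2, 11), perplexity=30.0, seed=0):
    """Embed feature vectors in 2D with t-SNE, then choose the number of
    Agglomerative Clustering groups that maximizes the Silhouette Score.

    All default parameter values here are illustrative assumptions.
    """
    embedding = TSNE(
        n_components=2, perplexity=perplexity, init="pca", random_state=seed
    ).fit_transform(features)

    best_k, best_score, best_labels = None, -1.0, None
    for k in k_range:
        # Cluster the 2D t-SNE output, not the original high-dimensional vectors.
        labels = AgglomerativeClustering(n_clusters=k).fit_predict(embedding)
        score = silhouette_score(embedding, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return embedding, best_labels, best_k, best_score


# Synthetic stand-in for glitch feature vectors: three well-separated blobs.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 50))
features = np.vstack([c + rng.normal(size=(60, 50)) for c in centers])

embedding, labels, n_groups, score = embed_and_cluster(
    features, k_range=range(2, 7), perplexity=20.0
)
print(f"selected {n_groups} groups, silhouette = {score:.2f}")
```

Because t-SNE is stochastic and does not preserve global distances, the selected number of groups may change with the random seed and perplexity, so a sketch like this is best read as a way to explore group structure rather than a fixed classification.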

Paper Structure

This paper contains 5 sections, 26 figures, 2 tables.

Figures (26)

  • Figure 1: Histogram of the frequencies of glitches with SNR $\geq 7.5$ during O4a. It shows the range in which the most common glitches at LLO occurred during O4a, between 10 Hz and 100 Hz.
  • Figure 2: Characteristic strains during O3 (blue) and O4a (pink), measured under similar microseismic motion amplitudes (or velocities). The blue circles highlight the impact of scattering glitches on the 18–30 Hz frequency band during O3, which was significantly reduced [soni2020reducing] in O4a (pink squares).
  • Figure 3: Average hourly glitch rates for each month during O4a. Dark blue bars, annotated with numerical values, represent the total rate of clustered Omicron transients with frequencies between 10 Hz and 2048 Hz and a minimum SNR of 7.5. Light blue bars with hatching indicate the subset of glitches with frequencies below 25 Hz.
  • Figure 4: Workflow for constructing the vectors used as input to t-SNE, illustrated for one example glitch. (a) Omicron triggers in a $\pm 0.5$ s window with the $30 \times 41$ time–frequency grid. (b) Pixelized glitchgram matrix with normalized SNR. (c) Flattened 1,230-dimensional vector, shown using the same colormap scale as in panel (b), used as input to t-SNE and clustering.
  • Figure 5: (a) Scatter plot of the 2D t-SNE output coordinates representing 13,500 data points, each originally embedded in a 1,230-dimensional space derived from unclustered Omicron triggers. Each point corresponds to a glitch. Histograms along the top and right axes illustrate the distribution of glitches along each coordinate. (b) The same t-SNE output, with points colored according to the classes assigned by Agglomerative Clustering.
  • ...and 21 more figures
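
The vector construction summarized in the Figure 4 caption (triggers in a $\pm 0.5$ s window, a $30 \times 41$ grid, normalized SNR, flattening to 1,230 values) might look like the sketch below. The caption does not give the details, so several choices here are assumptions: 41 time bins by 30 frequency bins, logarithmic frequency edges from 10 Hz to 2048 Hz, keeping the maximum SNR per pixel, and normalizing by the peak pixel. The function name `glitchgram_vector` is hypothetical.

```python
import numpy as np

N_TIME, N_FREQ = 41, 30  # assumed orientation of the 30 x 41 grid


def glitchgram_vector(times, freqs, snrs, t0,
                      half_window=0.5, f_min=10.0, f_max=2048.0):
    """Pixelize triggers around time t0 and return a flat 1,230-element vector.

    Assumptions (not stated in the source): log-spaced frequency bins,
    max SNR per pixel, normalization by the largest pixel value.
    """
    times = np.asarray(times, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    snrs = np.asarray(snrs, dtype=float)

    t_edges = np.linspace(t0 - half_window, t0 + half_window, N_TIME + 1)
    f_edges = np.geomspace(f_min, f_max, N_FREQ + 1)

    # Bin index of each trigger along each axis.
    ti = np.searchsorted(t_edges, times, side="right") - 1
    fi = np.searchsorted(f_edges, freqs, side="right") - 1
    inside = (ti >= 0) & (ti < N_TIME) & (fi >= 0) & (fi < N_FREQ)

    grid = np.zeros((N_FREQ, N_TIME))
    # Keep the loudest trigger in each pixel; triggers outside the grid are dropped.
    np.maximum.at(grid, (fi[inside], ti[inside]), snrs[inside])

    peak = grid.max()
    if peak > 0:
        grid /= peak  # normalized SNR in [0, 1]
    return grid.ravel()


# Three made-up triggers; the last one lies outside the +/-0.5 s window.
vec = glitchgram_vector(
    times=[100.0, 100.1, 100.9],
    freqs=[30.0, 200.0, 50.0],
    snrs=[8.0, 16.0, 40.0],
    t0=100.0,
)
print(vec.shape, np.count_nonzero(vec))
```

Stacking one such vector per glitch gives the matrix (for example 13,500 rows by 1,230 columns, as in Figure 5) that would be passed to t-SNE.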