Vision Transformer for Transient Noise Classification

Divyansh Srivastava, Andrzej Niedzielski

TL;DR

This work addresses the challenge of classifying transient glitches in LIGO data to improve gravitational-wave detection. It uses a pre-trained Vision Transformer (ViT-B/32) to classify glitches on an extended Gravity Spy dataset that adds two O3a noise classes, for a total of 24 classes. The model achieves a test accuracy of $92.26\%$ and an F1 score of $92.13\%$. Performance varies by class: some classes are classified exceptionally well, while others remain challenging. The results indicate that ViT-based approaches are viable for gravitational-wave data and could improve glitch discrimination. Future work is to unfreeze the encoder to push performance further.

Abstract

Transient noise (glitches) in LIGO data hinders the detection of gravitational waves (GW). The Gravity Spy project has categorized these noise events into various classes. The O3 run introduced two additional noise classes, creating a need to train new models for effective classification. We aim to classify glitches in LIGO data into the 22 existing classes from the first run plus the 2 additional noise classes from O3a using the Vision Transformer (ViT) model. Starting from a pre-trained Vision Transformer (ViT-B/32), we train the model on a combined dataset consisting of the Gravity Spy dataset and the two additional classes from the LIGO O3a run. We achieve a classification efficiency of 92.26%, demonstrating the potential of the Vision Transformer to improve the accuracy of gravitational-wave detection by effectively distinguishing transient noise.

Key words: gravitational waves, vision transformer, machine learning
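
The "/32" in ViT-B/32 refers to the side length, in pixels, of the square patches an image is cut into before embedding. As a rough illustration, the sketch below splits a toy image into non-overlapping patches and counts the tokens the encoder would receive. The helper names `patchify` and `vit_token_count` are hypothetical, and the 224-pixel input size mentioned afterwards is an assumption (a common ViT pre-training resolution), not a value stated here.

```python
def patchify(image, patch):
    """Split an H x W grid (a list of rows) into non-overlapping patch x patch tiles.

    Each tile is flattened row by row; tiles are ordered left to right,
    top to bottom, which is the order they would enter the encoder.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([image[r][c]
                          for r in range(top, top + patch)
                          for c in range(left, left + patch)])
    return tiles


def vit_token_count(image_size, patch_size):
    """Tokens seen by the encoder: one per patch plus one classification token."""
    per_side = image_size // patch_size
    return per_side * per_side + 1


# Toy 4 x 4 single-channel "image" with pixel values 0..15 (illustrative only).
toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
toy_patches = patchify(toy_image, 2)
```

Under the assumed 224 x 224 input, `vit_token_count(224, 32)` corresponds to 7 x 7 = 49 patches plus one classification token, so 50 tokens in total.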

Paper Structure

This paper contains 5 sections, 1 equation, 4 figures.

Figures (4)

  • Figure 1: Spectrograms showing different glitch classes from the first Gravity Spy dataset and the additional classes Blip_Low_Frequency and Fast_Scattering from O3a. The spectrograms for Blip_Low_Frequency and Fast_Scattering span $\pm 0.5$ seconds around the events and are zoomed in for identifiability.
  • Figure 2: The image is divided into patches, each of which is linearly embedded. Positional embeddings are added, and the resulting sequence of vectors is fed into a standard Transformer encoder. For classification, an additional classification token is prepended to the sequence. The MLP head then classifies the output into one of 24 classes. Illustration inspired by Dosovitskiy et al. (2021), "An Image is Worth 16x16 Words".
  • Figure 3: Training and validation accuracy and loss over 15 epochs.
  • Figure 4: Confusion Matrix for the test dataset showing the classification performance of the Vision Transformer (ViT-B/32) model across different classes.
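
Overall accuracy and F1 scores of the kind quoted above can be read off a confusion matrix like the one in Figure 4. The following is a minimal sketch, not the authors' evaluation code. It assumes rows index the true class and columns the predicted class, and it shows both macro and support-weighted F1 because the averaging behind the reported F1 is not stated here. The 3-class matrix `toy_cm` is made-up data for illustration.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def per_class_f1(cm):
    """F1 per class, assuming cm[true][predicted] holds sample counts."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # class k predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp    # other classes predicted as k
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(cm):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)


def weighted_f1(cm):
    """Per-class F1 averaged with weights equal to each class's true sample count."""
    scores = per_class_f1(cm)
    supports = [sum(row) for row in cm]
    return sum(s * w for s, w in zip(scores, supports)) / sum(supports)


# Made-up 3-class confusion matrix (rows: true class, columns: predicted class).
toy_cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 9],
]
```

Macro averaging treats every class equally, so rare glitch classes count as much as common ones, whereas the weighted variant follows the class frequencies in the test set. The two coincide only when every class has the same number of test samples, as in the toy matrix.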