Large Language Models for Limited Noisy Data: A Gravitational Wave Identification Study

Yixuan Li, Yuhao Lu, Yang Liu, Liang Li, R. Ruffini, Di Li, Rong-Gen Cai, Xiaoyan Zhu, Wenbin Lin, Yu Wang

TL;DR

To address the problem of identifying gravitational wave signals in non-Gaussian, non-stationary detector noise with limited labeled data, the authors evaluate large language models trained directly on observational data. They convert time-series data into time-frequency patch tokens and fine-tune an 8B-parameter LLM (Meta-Llama-3-8B-Instruct) using LoRA, achieving 97.4% recall on held-out GW segments without simulated injections. They show that adding large simulated datasets provides negligible gains, while increasing model size yields predictable improvements that converge around 8B parameters; larger training sets also improve performance, with diminishing returns at large scales. The results suggest that LLMs can efficiently extract global, coherent patterns from complex astronomical data and may generalize to other domains with similar noise characteristics.
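The "time-frequency patch tokens" step can be illustrated with a minimal sketch. This is not the authors' pipeline: the window length, hop size, patch shape, and the toy chirp are all assumptions chosen for illustration, and the function names `spectrogram` and `to_patches` are hypothetical. It only shows the general idea of turning a 1D strain series into a grid of magnitudes and cutting that grid into flattened patches that a model could embed as tokens.

```python
import cmath
import math


def spectrogram(signal, window=16, hop=8):
    """Magnitude short-time Fourier transform using a naive DFT.

    Returns a list of time frames, each holding window // 2 + 1 frequency bins.
    Window and hop sizes are illustrative assumptions, not values from the paper.
    """
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / (window - 1)) for n in range(window)]
    frames = []
    for start in range(0, len(signal) - window + 1, hop):
        chunk = [signal[start + n] * hann[n] for n in range(window)]
        bins = []
        for k in range(window // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / window)
                for n in range(window)
            )
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def to_patches(spec, patch_t=4, patch_f=4):
    """Cut a (time x frequency) grid into non-overlapping, flattened patches.

    Frames and bins that do not fill a whole patch are dropped at the edges.
    """
    n_t = len(spec) // patch_t
    n_f = len(spec[0]) // patch_f
    patches = []
    for i in range(n_t):
        for j in range(n_f):
            patch = [
                spec[i * patch_t + a][j * patch_f + b]
                for a in range(patch_t)
                for b in range(patch_f)
            ]
            patches.append(patch)
    return patches


# Toy chirp whose frequency rises over one second (hypothetical stand-in for strain data).
fs = 256
signal = [math.sin(2 * math.pi * (10 + 40 * t / fs) * t / fs) for t in range(fs)]

spec = spectrogram(signal)
patches = to_patches(spec)
print(f"{len(spec)} frames x {len(spec[0])} bins -> {len(patches)} patches of length {len(patches[0])}")
```

In a real setting each flattened patch would be mapped by a learned projection into the model's embedding space; that step is omitted here because the summary does not describe it.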

Abstract

This work investigates whether large language models (LLMs) offer advantages over traditional neural networks for astronomical data processing in regimes with non-Gaussian, non-stationary noise and limited labeled samples. Gravitational wave observations provide a suitable test case: using only 90 LIGO events, finetuned LLMs achieve 97.4% accuracy in identifying signals. Further experiments show that, in contrast to traditional networks that rely on large simulated datasets, additional simulated samples do not improve LLM performance, while scaling studies reveal predictable gains with increasing model size and dataset size. These results indicate that LLMs can extract discriminative structure directly from observational data and provide an efficient assessment for gravitational wave identification. The same strategy may extend to other astronomical domains with similar noise properties, such as radio or pulsar observations.

Paper Structure

This paper contains 15 sections, 17 equations, 5 figures.

Figures (5)

  • Figure 1: Comparison between a gravitational wave signal segment and a noise-only segment. Top: representation of the GW150914 event. Bottom: representative noise segment from the same detector band.
  • Figure 2: Identification performance of the finetuned LLM on LIGO observational data. The model, trained on only 90 events, achieves a recall of 97.4% for both signal segments and noise segments. The misclassification rate of 2.6% per class demonstrates that the model maintains balanced and reliable performance despite the strongly non-Gaussian and non-stationary noise present in detector data.
  • Figure 3: Confusion matrix for identification on LIGO observational data using a model pre-finetuned on large-scale simulated samples. The performance is comparable to that reported in Figure 2, indicating that simulation-based pre-finetuning does not provide measurable improvement over finetuning on observational data alone.
  • Figure 4: Classification accuracy of LLMs as a function of parameter size. Results are shown for three independent runs of each configuration; the solid line denotes the mean accuracy and the shaded region shows the standard deviation. Accuracy improves steadily as model size increases, and the three architectures converge to similar performance in the 7-8 billion parameter range.
  • Figure 5: Model accuracy as a function of dataset size. Each dataset configuration is evaluated over three independent runs, with all individual results shown. The solid line denotes the mean accuracy and the shaded region shows the standard deviation, illustrating the scalability of LLM performance as training data increases.
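
The quantities in these captions (per-class recall in Figures 2 and 3, mean and standard deviation over three runs in Figures 4 and 5) follow from simple arithmetic, sketched below. All counts and run accuracies here are made up: the confusion-matrix counts were picked only so the ratios round to the 97.4% and 2.6% quoted in Figure 2, and they are not the paper's test-set sizes. The summary also does not say whether the shaded band uses the sample or population standard deviation; the sample version is assumed.

```python
from statistics import mean, stdev


def per_class_recall(tp, fn, tn, fp):
    """Return (signal recall, noise recall, overall accuracy) from confusion-matrix counts."""
    signal_recall = tp / (tp + fn)
    noise_recall = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return signal_recall, noise_recall, accuracy


# Hypothetical balanced test set: one miss per class out of 39 segments each.
signal_recall, noise_recall, accuracy = per_class_recall(tp=38, fn=1, tn=38, fp=1)
print(f"signal recall {signal_recall:.1%}, noise recall {noise_recall:.1%}, accuracy {accuracy:.1%}")

# Hypothetical accuracies from three independent runs of one configuration.
run_accuracies = [0.95, 0.97, 0.96]
run_mean = mean(run_accuracies)   # centre of the solid line
run_std = stdev(run_accuracies)   # half-width of the shaded band (sample std, assumed)
print(f"mean {run_mean:.3f} +/- {run_std:.3f}")
```

With equal class sizes and equal per-class recall, overall accuracy equals that recall, which is why the same 97.4% figure can be reported as either recall or accuracy.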