Iterative Encoding-Decoding VAEs Anomaly Detection in NOAA's DART Time Series: A Machine Learning Approach for Enhancing Data Integrity for NASA's GRACE-FO Verification and Validation

Kevin Lee

TL;DR

NOAA DART time series suffer from spikes, steps, and drifts that impede tsunami detection and GRACE-FO validation. The authors introduce Iterative Encoding-Decoding VAEs, a multi-iteration VAE framework that refines latent representations and reconstructions to remove anomalies while preserving genuine ocean signals, aided by a hybrid thresholding scheme and multi-scale analysis. Applied to a challenging 2022 dataset from Station 23461, the method yields cleaner reconstructions with improved residual characteristics, rate-of-change detection, and latent-space separation, outperforming classical techniques. The approach is GPU-accelerated and memory-efficient, and it is designed to support V&V workflows for GRACE-FO and future climate-modeling efforts by delivering interpretable, high-integrity time-series data.

Abstract

NOAA's Deep-ocean Assessment and Reporting of Tsunamis (DART) data are critical for NASA-JPL's tsunami detection, real-time operations, and oceanographic research. However, these time series often contain spikes, steps, and drifts that degrade data quality and obscure essential oceanographic features. To address these anomalies, the work introduces an Iterative Encoding-Decoding Variational Autoencoders (Iterative Encoding-Decoding VAEs) model to improve the quality of DART time series. Unlike traditional filtering and thresholding methods, which risk distorting inherent signal characteristics, Iterative Encoding-Decoding VAEs progressively remove anomalies while preserving the data's latent structure. A hybrid thresholding approach further retains genuine oceanographic features near boundaries. Applied to complex DART datasets, this approach yields reconstructions that maintain key oceanic properties better than classical statistical techniques do, with more robust handling of spikes and subtle step changes. The resulting high-quality data support critical verification and validation efforts for the GRACE-FO mission at NASA-JPL, where accurate surface measurements are essential to modeling Earth's gravitational field and global water dynamics. Ultimately, this data processing method enhances tsunami detection and underpins future climate modeling with improved interpretability and reliability.
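
The progressive "reconstruct, flag, replace" loop described above can be illustrated with a minimal sketch. It is not the authors' implementation. A centered rolling median stands in for the trained VAE's encode-decode pass, and `hybrid_threshold` is a hypothetical rule (the larger of a robust MAD bound and a percentile bound), since this section does not specify the paper's hybrid scheme. All function names, parameters, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def rolling_median(x, window=11):
    """Stand-in reconstructor (NOT a VAE): centered rolling median, edges padded."""
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1)


def hybrid_threshold(deviation, k=6.0, pct=99.0):
    """Hypothetical hybrid rule: larger of a robust MAD bound and a percentile bound."""
    mad_bound = k * 1.4826 * np.median(deviation)  # 1.4826 scales MAD to a Gaussian sigma
    return max(mad_bound, np.percentile(deviation, pct))


def iterative_clean(x, reconstruct=rolling_median, n_iter=5):
    """Repeat reconstruct -> flag -> replace until no new points are flagged."""
    cleaned = np.asarray(x, dtype=float).copy()
    flagged = np.zeros(cleaned.shape, dtype=bool)
    for _ in range(n_iter):
        recon = reconstruct(cleaned)
        residual = cleaned - recon
        # Absolute deviation of each residual from the median residual.
        deviation = np.abs(residual - np.median(residual))
        new = deviation > hybrid_threshold(deviation)
        if not new.any():
            break
        cleaned[new] = recon[new]  # replace flagged samples with the reconstruction
        flagged |= new
    return cleaned, flagged


# Synthetic example (assumed data, not DART measurements).
rng = np.random.default_rng(0)
t = np.arange(1000)
signal = np.sin(2 * np.pi * t / 200)            # smooth tide-like component
series = signal + 0.05 * rng.standard_normal(t.size)
spike_idx = np.array([100, 400, 750])
series[spike_idx] += 3.0                        # injected spikes

cleaned, flagged = iterative_clean(series)
print("spikes flagged:", flagged[spike_idx], "total flagged:", int(flagged.sum()))
```

Any callable that maps a series to a reconstruction of the same length could be passed as `reconstruct`, which is where a trained encoder-decoder would plug in under this sketch's assumptions.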

Paper Structure

This paper contains 65 sections, 54 equations, 4 figures, 2 algorithms.

Figures (4)

  • Figure 1: DART Water Level Data, 23461t2022: Original vs. Cleaned Data
  • Figure 2: Time Series Residuals and Detected Anomalies
  • Figure 3: Training History
  • Figure 4: Latent Space Visualization: Clustering of Normal vs. Anomalous Patterns