Advancing Ocean State Estimation with efficient and scalable AI

Yanfei Xiang, Yuan Gao, Hao Wu, Quan Zhang, Ruiqi Shu, Xiao Zhou, Xi Wu, Xiaomeng Huang

TL;DR

This work addresses accurate, scalable global ocean state estimation under data fidelity constraints by introducing ADAF-Ocean, an AI-driven data assimilation framework that directly ingests multi-source, multi-scale observations without interpolation. Using a Neural Process–inspired encoder–decoder, ADAF-Ocean fuses observations and background fields to produce high-fidelity analyses and enables AI-driven super-resolution from 1-degree to 0.25-degree with only 3.7% more parameters. When coupled with a DL forecasting model, the framework extends forecast skill by up to 20 days relative to baselines without assimilation, and spectral analyses and targeted regional studies show that it reconstructs high-frequency mesoscale dynamics. The approach offers a computationally viable, scientifically rigorous path toward real-time, high-resolution Earth system monitoring and provides actionable insights for observation-network optimization.

Abstract

Accurate and efficient global ocean state estimation remains a grand challenge for Earth system science, hindered by the dual bottlenecks of computational scalability and degraded data fidelity in traditional data assimilation (DA) and deep learning (DL) approaches. Here we present an AI-driven Data Assimilation Framework for Ocean (ADAF-Ocean) that directly assimilates multi-source and multi-scale observations, ranging from sparse in-situ measurements to 4 km satellite swaths, without any interpolation or data thinning. Inspired by Neural Processes, ADAF-Ocean learns a continuous mapping from heterogeneous inputs to ocean states, preserving native data fidelity. Through AI-driven super-resolution, it reconstructs 0.25$^\circ$ mesoscale dynamics from coarse 1$^\circ$ fields, which ensures both efficiency and scalability, with just 3.7% more parameters than the 1$^\circ$ configuration. When coupled with a DL forecasting system, ADAF-Ocean extends global forecast skill by up to 20 days compared to baselines without assimilation. This framework establishes a computationally viable and scientifically rigorous pathway toward real-time, high-resolution Earth system monitoring.
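
To make the Neural Process idea in the abstract concrete, here is a minimal sketch of a point-based encoder and a coordinate-query decoder. It is not the ADAF-Ocean architecture: the layer sizes, the mean-pooling aggregation, the function names (`init_mlp`, `encode`, `decode`), and the random untrained weights are all assumptions chosen for illustration. It only shows why such a design can ingest an arbitrary number of scattered observations without gridding them first, and why the output can be queried at any coordinate.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT = 16  # hypothetical latent size, not taken from the paper


def init_mlp(sizes):
    """Random (untrained) weights for a small MLP; sizes = [in, hidden, ..., out]."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(params, x):
    """Apply the MLP with tanh on every layer except the last."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x


enc_params = init_mlp([3, 32, LATENT])      # (lat, lon, value) -> per-point feature
dec_params = init_mlp([LATENT + 2, 32, 1])  # (latent, lat, lon) -> predicted value


def encode(coords, values):
    """Embed each observation point, then mean-pool into one latent vector.

    coords: (N, 2) array of (lat, lon); values: (N,) array of measurements.
    Mean pooling makes the result independent of N and of point order.
    """
    feats = mlp(enc_params, np.column_stack([coords, values]))
    return feats.mean(axis=0)


def decode(latent, query_coords):
    """Predict a value at each query coordinate from the pooled latent."""
    tiled = np.broadcast_to(latent, (len(query_coords), LATENT))
    return mlp(dec_params, np.column_stack([tiled, query_coords]))[:, 0]


# Usage: 200 scattered synthetic observations, queried at 5 arbitrary locations.
obs_coords = rng.uniform([-90.0, -180.0], [90.0, 180.0], size=(200, 2))
obs_values = rng.normal(size=200)
latent = encode(obs_coords, obs_values)
query = np.array([[0.0, 0.0], [10.5, 120.25], [-45.0, -30.0],
                  [60.0, 179.0], [-80.0, 5.0]])
pred = decode(latent, query)
```

Because the decoder takes coordinates as input, the same latent can be evaluated on a 1-degree or a 0.25-degree set of query points without changing the network, which is one plausible reading of how output resolution can be decoupled from parameter count.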

Paper Structure

This paper contains 23 sections, 12 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: Overall performance of analysis for 5 surface ocean variables. a, b Comparison of ADAF-Ocean-Full, ADAF-Ocean-Thinned, and ADAF-Thinned analysis in terms of ACC improvement and RMSE reduction. c Monthly latitude-weighted RMSE trends for the background and analysis produced by ADAF-Ocean-Full, along with RMSE reduction for the analysis. d Spatial MAE reduction ($\text{MAE}_{\text{analysis}} - \text{MAE}_{\text{background}}$). Negative values (blue) indicate a reduction in MAE achieved by ADAF-Ocean.
  • Figure 2: AI-Driven Super-Resolution Reconstruction and Physical Fidelity. The figure demonstrates the ability of ADAF-Ocean to reconstruct fine-scale features compared to linear interpolation (Interp). a Power Spectral Density (PSD) analysis from a representative test day across the global domain for T, S, Stream speed ($\sqrt{U^2 + V^2}$), and SSH, showing the superior capture of high-wavenumber (small-scale) features by ADAF-Ocean. b-e Case studies in the highly dynamic Kuroshio region, showing the target from GLORYS reanalysis, $0.25^\circ$ analysis produced by ADAF-Ocean and Interp, and the resulting MAE reduction ($\text{MAE}_{\text{ADAF-Ocean}} - \text{MAE}_{\text{Interp}}$). Negative values (blue) indicate lower MAE achieved by ADAF-Ocean compared to Interp.
  • Figure 3: Global forecast after a single DA. a Time-averaged, latitude-weighted RMSE reduction during a 27-day forecast, comparing three types of initial conditions (ICs): ADAF-Ocean-Full (red), ADAF-Ocean-Thinned (blue), and ADAF-Thinned (green), relative to forecasts initialized with baseline GLORYS (w/o DA). More negative values (larger reduction) indicate better forecast skill compared to the baseline. b GLORYS used as a reference target. c 15-day forecast for T, S, Stream speed ($\sqrt{U^2+V^2}$), and SSH using ADAF-Ocean-Full IC. Bias maps of the 15-day forecast using ADAF-Ocean-Full IC (d) and GLORYS IC (e).
  • Figure 4: The model architecture. a Overview of the DA process: Observations ($\mathbf{O}$) and background fields ($\mathbf{B}$) are integrated to generate the analysis ($\mathbf{y}^a$). b The point-based Encoder ($E$) processes coordinates and values ($\mathbf{x}, \mathbf{y}$) into the latent feature $\mathbf{R}$. c The Decoder ($\mathbf{D}$) uses the fused latent feature ($\mathbf{R}^c$) and query coordinates ($\mathbf{x}^a$) to predict the analysis ($\mathbf{y}^a$). d Specialized encoders ($\{E_j^o\}, E^b$) independently transform multi-source inputs ($\mathbf{O}, \mathbf{B}$) into latent features ($\mathbf{R}^o, \mathbf{R}^b$). e Global partitioning: The domain is split into overlapping regions to ensure continuity and manage computational memory.
  • Figure 5: Monthly ACC and Improvements for 5 Ocean Surface Variables. Solid lines represent the analysis (ADAF-Ocean), dashed lines indicate the background (no DA), and bars show ACC improvements for temperature (T), salinity (S), zonal velocity (U), meridional velocity (V), and sea surface height (SSH).
  • ...and 5 more figures
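
The captions above report latitude-weighted RMSE and ACC (Figures 1, 3, and 5). The exact definitions used in the paper are not given in this section, so the following is a sketch under a common convention: each grid row is weighted by the cosine of its latitude, normalized to mean 1, and ACC is the weighted correlation of anomalies relative to a climatology. The function names, the synthetic fields, and the absence of a land mask are assumptions made for illustration.

```python
import numpy as np


def lat_weights(lats_deg):
    """Cosine-latitude weights normalized to mean 1 (assumed convention)."""
    w = np.cos(np.deg2rad(lats_deg))
    return w / w.mean()


def weighted_rmse(pred, target, lats_deg):
    """Latitude-weighted RMSE for fields of shape (n_lat, n_lon)."""
    w = lat_weights(lats_deg)[:, None]
    return float(np.sqrt(np.mean(w * (pred - target) ** 2)))


def weighted_acc(pred, target, clim, lats_deg):
    """Latitude-weighted anomaly correlation coefficient relative to clim."""
    w = lat_weights(lats_deg)[:, None]
    pa, ta = pred - clim, target - clim
    num = np.sum(w * pa * ta)
    den = np.sqrt(np.sum(w * pa ** 2) * np.sum(w * ta ** 2))
    return float(num / den)


# Synthetic 1-degree example: the "analysis" has smaller errors than the "background".
rng = np.random.default_rng(0)
lats = np.arange(-89.5, 90.0, 1.0)          # 180 cell-centre latitudes
clim = np.zeros((lats.size, 360))           # hypothetical climatology
target = rng.normal(size=clim.shape)        # stand-in for a reference field
background = target + rng.normal(scale=1.0, size=clim.shape)
analysis = target + rng.normal(scale=0.5, size=clim.shape)

rmse_bg = weighted_rmse(background, target, lats)
rmse_an = weighted_rmse(analysis, target, lats)
rmse_reduction = rmse_an - rmse_bg          # negative means the analysis is better
acc_improvement = (weighted_acc(analysis, target, clim, lats)
                   - weighted_acc(background, target, clim, lats))
```

The sign convention matches the captions: a reduction is computed as analysis minus background, so negative RMSE or MAE differences indicate an improvement, while a positive ACC difference indicates an improvement.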