Reconstructing the Tropical Pacific Upper Ocean using Online Data Assimilation with a Deep Learning model

Zilu Meng; Gregory J. Hakim

TL;DR

This work develops and tests a transformer-based deep learning (DL) model for forecasting and reconstructing tropical Pacific upper-ocean states from sparse coral proxies, comparing it against a Linear Inverse Model (LIM). Trained on CMIP6 historical data and validated with SODA and GODAS reanalysis, the DL model forecasts 12 months ahead and forms priors for Ensemble Kalman Filter data assimilation; to address forecast variance loss, an inflation scheme injects hindcast-derived noise scaled by the inter-model envelope. Across online assimilation experiments with 24 pseudo-proxies, the DL model yields higher domain-averaged correlations and better Nino3.4 reconstructions than the LIM and offline baselines, with gains increasing as the observation-averaging time grows. The results support the DL model as a computationally efficient source of priors for online paleoclimate data assimilation and motivate applying the method to real proxy data and broader regions.

Abstract

A deep learning (DL) model, based on a transformer architecture, is trained on a climate-model dataset and compared with a standard linear inverse model (LIM) in the tropical Pacific. We show that the DL model produces more accurate forecasts than the LIM when tested on a reanalysis dataset. We then assess the ability of an ensemble Kalman filter to reconstruct the monthly-averaged upper ocean from a noisy set of 24 sea-surface temperature observations designed to mimic existing coral proxy measurements, and compare results for the DL model and the LIM. Because the DL model damps the signal, we implement a novel inflation technique that adds noise from hindcast experiments. Results show that assimilating observations with the DL model yields better reconstructions than the LIM for observation averaging times ranging from one month to one year. The improved reconstruction is due to the enhanced predictive capabilities of the DL model, which map the memory of past observations to future assimilation times.
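As a rough illustration of the two ingredients named above, the following sketch shows one ensemble Kalman filter analysis step and a simple additive inflation step. It is not the authors' implementation: the perturbed-observation variant of the filter, the linear observation operator `H`, the function names `enkf_update` and `additive_inflation`, and the single scalar `scale` are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import numpy as np


def enkf_update(prior, H, y, obs_var, rng):
    """One perturbed-observation EnKF analysis step (assumed variant).

    prior   : (n_state, n_ens) prior ensemble, e.g. model forecasts
    H       : (n_obs, n_state) linear observation operator (assumption)
    y       : (n_obs,) observations
    obs_var : (n_obs,) observation-error variances
    rng     : numpy.random.Generator
    """
    n_ens = prior.shape[1]
    # Ensemble anomalies in state space and in observation space.
    x_anom = prior - prior.mean(axis=1, keepdims=True)
    hx = H @ prior
    hx_anom = hx - hx.mean(axis=1, keepdims=True)
    # Sample covariances: state-obs (P H^T) and obs-obs (H P H^T).
    pht = x_anom @ hx_anom.T / (n_ens - 1)
    hpht = hx_anom @ hx_anom.T / (n_ens - 1)
    innov_cov = hpht + np.diag(obs_var)
    # Kalman gain K = P H^T (H P H^T + R)^{-1}; innov_cov is symmetric.
    gain = np.linalg.solve(innov_cov, pht.T).T
    # Each member assimilates its own noisy copy of the observations.
    noise = rng.normal(0.0, np.sqrt(obs_var)[:, None], size=hx.shape)
    return prior + gain @ (y[:, None] + noise - hx)


def additive_inflation(ensemble, noise_samples, scale, rng):
    """Add randomly drawn noise columns to each member (hypothetical scheme).

    noise_samples : (n_state, n_samples) pool of error samples, standing in
                    for hindcast-derived noise
    scale         : scalar multiplier applied to the drawn noise (assumption)
    """
    n_ens = ensemble.shape[1]
    idx = rng.integers(0, noise_samples.shape[1], size=n_ens)
    return ensemble + scale * noise_samples[:, idx]
```

In an online setting, the analysis ensemble returned by `enkf_update` would be advanced by the forecast model to the next observation time, inflated, and used as the next prior; in an offline setting the prior is not propagated from the previous analysis.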

Paper Structure (15 sections, 22 equations, 14 figures, 1 table)

Introduction
Data, Models, and Data Assimilation Methods
Data
Models
Linear Inverse Model (LIM)
Deep Learning Model (DL)
Model Training and Configuration
Data Assimilation Methods
Ensemble, Online and Offline Assimilation
DL model Ensemble Inflation
Evaluation Criteria
Observing Network
Forecasting Results
Data Assimilation Results
Conclusion and Discussion

Figures (14)

Figure 1: Variance proportion of deep learning forecasts compared to GODAS observations across different variables over time. Illustrated here are the variance ratios for predictions made by the Deep Learning model relative to actual observations from the GODAS dataset, covering variables such as the Nino3.4 Index, Sea Surface Temperature (SST), wind stress, and ocean temperature, as a function of varying lead times in months.
Figure 2: Comparison of standard deviation ratios: SODA vs. CMIP6 Models for the Nino3.4 Index. This graph displays the ratio of the Nino3.4 index standard deviation from the SODA dataset to that of various CMIP6 models, which is utilized as a scaling factor for noise based on the CMIP6 model data.
Figure 3: Geographical distribution of coral $\delta^{18}O$ proxy records from the PAGES2K database in the tropical Pacific region. The dashed blue box delineates the modeling domain, the dashed red box indicates the Nino3.4 region, and the brownish-yellow dots mark the locations of coral $\delta^{18}O$ proxy records.
Figure 4: Forecast skill of the Deep Learning model (red), the Linear Inverse Model (blue), and LIMOsst (green) in terms of correlation (solid lines) and RMSE (dashed lines) as a function of forecast lead time for the Nino3.4 Index (a), the Sea Surface Temperature (SST) field (b), the ocean temperature field (c), and the wind stress field (d). The correlation scale is on the left $y$-axis and the RMSE scale on the right $y$-axis.
Figure 5: Difference in forecast skill, measured by correlation, between the Deep Learning and Linear Inverse Models as a function of lead time ($\tau$). The first row (a–d) shows zonal wind stress, the second row (e–h) meridional wind stress, the third row (i–l) Sea Surface Temperature (SST), and the fourth row (m–p) equatorial (5°N–5°S) ocean temperature.
...and 9 more figures
