Static and auto-regressive neural emulation of phytoplankton biomass dynamics from physical predictors in the global ocean

Mahima Lakra, Ronan Fablet, Lucas Drumetz, Etienne Pauthenet, Elodie Martinez

TL;DR

This study models global phytoplankton biomass dynamics from physical ocean predictors using deep learning. It systematically compares static and auto-regressive neural emulators (CNN, ConvLSTM, AFNO/4CastNet, and UNet) trained on OC-CCI Chlorophyll-a (Chl) satellite data with eight physical predictors, and evaluates them with EOF-based diagnostics and standard metrics. The UNet emerges as the strongest static emulator, while auto-regressive UNet variants extend forecast skill up to about six months, suggesting a predictability limit comparable to a Lyapunov time scale in Chl dynamics; the amplitude of low-frequency variability remains difficult to capture. Overall, the work shows that neural emulators driven by physical predictors can reconstruct and forecast phytoplankton dynamics, offering a data-driven tool to monitor ocean health and support ecosystem management in a changing climate, with direct applicability to long-term satellite data fusion and short-term forecasting.

Abstract

Phytoplankton form the basis of marine food webs, driving both ecological processes and global biogeochemical cycles. Despite their ecological and climatic significance, accurately simulating phytoplankton dynamics remains a major challenge for biogeochemical numerical models due to limited parameterizations, sparse observational data, and the complexity of oceanic processes. Here, we explore how deep learning models can address these limitations by predicting the spatio-temporal distribution of phytoplankton biomass in the global ocean from satellite observations and environmental conditions. First, we investigate several deep learning architectures. Among the tested models, the UNet architecture stands out for its ability to reproduce the seasonal and interannual patterns of phytoplankton biomass more accurately than other models such as CNNs, ConvLSTM, and 4CastNet. UNet performs better when given one to two months of environmental data as input, although it tends to underestimate the amplitude of low-frequency changes in phytoplankton biomass. To improve predictions over time, an auto-regressive version of UNet was also tested, in which the model uses its own previous predictions to forecast future conditions. This approach works well for short-term forecasts (up to five months), though its performance decreases at longer time scales. Overall, our study shows that combining ocean physical predictors with deep learning allows for reconstruction and short-term prediction of phytoplankton dynamics. These models could become powerful tools for monitoring ocean health and supporting marine ecosystem management, especially in the context of climate change.
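
As an illustration of the auto-regressive idea described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a one-step emulator that takes the current physical predictors together with the previous Chl estimate; how the paper actually combines the two is not stated here. The names `autoregressive_rollout` and `toy_step` are hypothetical, and `toy_step` is only a stand-in for a trained UNet.

```python
from typing import Callable, List, Sequence

Field = List[float]  # a flattened Chl map; a real emulator would use W x L arrays


def autoregressive_rollout(
    step: Callable[[Sequence[float], Field], Field],
    predictors: Sequence[Sequence[float]],
    chl_init: Field,
) -> List[Field]:
    """Roll a one-step emulator forward, feeding each prediction back in.

    The same `step` function (same parameters) is applied at every lead
    time, which is one way to read "tied weights" across the rollout.
    """
    forecasts: List[Field] = []
    chl_prev = chl_init
    for phys_t in predictors:            # one entry per forecast month
        chl_next = step(phys_t, chl_prev)
        forecasts.append(chl_next)
        chl_prev = chl_next              # the prediction becomes the next input
    return forecasts


def toy_step(phys: Sequence[float], chl_prev: Field) -> Field:
    """Hypothetical stand-in for the UNet: damped persistence plus mean forcing."""
    forcing = sum(phys) / len(phys)
    return [0.5 * c + forcing for c in chl_prev]


if __name__ == "__main__":
    # Three forecast months, eight physical predictors per month (toy values).
    months = [[1.0] * 8 for _ in range(3)]
    print(autoregressive_rollout(toy_step, months, chl_init=[0.0, 2.0]))
```

Because each step consumes the previous output, errors can accumulate with lead time, which is consistent with the reported loss of skill beyond a few months.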

Paper Structure (17 sections, 2 equations, 9 figures, 4 tables)

Figures (9)

  • Figure 1: The UNet is represented as an encoder-decoder architecture. Given an input state of size $W \times L \times 8$, where the dimension $8$ is the number of physical predictors, the UNet generates outputs of size $W \times L \times 1$. Left: static (One-to-One) architecture. Right: Auto-Regressive architecture. The weights of the UNet block are tied, i.e. the same parameters are used in every UNet block in each case.
  • Figure 2: Scatterplots of log(Chl$_{Sat}$) vs. reconstructed log(Chl) from the CNN (left column) and UNet (right column), across different oceanic basins over [2012--2017].
  • Figure 3: Correlation and NRMSE maps between Chl$_{Sat}$ and reconstructed Chl from the CNN over the [2012--2017] testing time period, together with the correlation and NRMSE differences of ConvLSTM, 4CastNet, and UNet relative to the CNN. In all panels, the best results and improvements are shown in blue.
  • Figure 4: EOF analysis of seasonal and non-seasonal Chl$_{Sat}$ over the testing time period [2012--2017] and the projections of the reconstructed Chl from the four static models onto these EOFs. Panels (a),(c),(e),(g) show spatial patterns; (b),(d),(f),(h) show the corresponding principal components (PCs). The EOF in (e),(f) is calculated on the non-seasonal signal over [2002--2018].
  • Figure 5: Trade-off between forecast accuracy and lead time for various parameterizations of the auto-regressive approach. The figure compares the RMSE of the UNet configurations at increasing lead times (how far ahead the forecast extends).
  • ...and 4 more figures
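
The per-pixel correlation and NRMSE maps mentioned for Figure 3 can be sketched as follows. This is an illustrative Python example under stated assumptions, not the paper's evaluation code: the NRMSE is normalised here by the standard deviation of the observed series, which is one common convention and may differ from the authors' choice, and the function names (`pearson`, `nrmse`, `skill_maps`) are hypothetical. Maps are flattened so that each frame is a list of pixels.

```python
import math
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def nrmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """RMSE divided by the standard deviation of the observations.

    Assumption: normalisation by std(obs); other conventions use the
    mean or the range of the observations instead.
    """
    n = len(obs)
    mean_obs = sum(obs) / n
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / n)
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    return rmse / std_obs


def skill_maps(
    obs_maps: Sequence[Sequence[float]],
    pred_maps: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Per-pixel correlation and NRMSE from stacks indexed [time][pixel].

    Pixels whose series is constant (e.g. masked land) are not handled
    here and would raise ZeroDivisionError.
    """
    n_pixels = len(obs_maps[0])
    corr: List[float] = []
    err: List[float] = []
    for p in range(n_pixels):
        o = [frame[p] for frame in obs_maps]
        r = [frame[p] for frame in pred_maps]
        corr.append(pearson(o, r))
        err.append(nrmse(o, r))
    return corr, err
```

A difference map between two models, as in the Figure 3 panels, would then be the element-wise difference of the two `corr` (or `err`) lists.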