Atmospheric Transport Modeling of CO$_2$ with Neural Networks

Vitus Benson, Ana Bastos, Christian Reimers, Alexander J. Winkler, Fanny Yang, Markus Reichstein

TL;DR

This work introduces CarbonBench, a systematic ML benchmark for Eulerian atmospheric tracer transport, and uses it to compare four neural architectures (UNet, GraphCast, SFNO, SwinTransformer) on CO$_2$ transport emulation with physics-informed adjustments. A SwinTransformer-based emulator, enhanced by CentFlux, SpecLoss, and Massfixer, achieves accurate 90-day forecasts ($R^2 > 0.99$, RMSE $< 1$ ppm) and remains stable for multi-year rollouts, while all four architectures achieve stable, mass-conserving transport for over 6 months. The study demonstrates the feasibility of neural network emulators for forward and inverse modeling of inert tracers at high resolution, highlighting practical implications for MRV, G3W, and policy-relevant climate monitoring, as well as avenues for future multi-resolution and differentiable inverse modeling. It also discusses trade-offs between accuracy, mass conservation, and computational costs, and outlines paths toward integrating AI emulators into operational atmospheric transport workflows.

Abstract

Accurately describing the distribution of CO$_2$ in the atmosphere with atmospheric tracer transport models is essential for greenhouse gas monitoring and verification support systems that aid the implementation of international climate agreements. Large deep neural networks are poised to revolutionize weather prediction, which requires 3D modeling of the atmosphere. While similar in this regard, atmospheric transport modeling poses new challenges: predictions must remain stable over longer time horizons and conserve mass throughout, while IO plays a larger role relative to computational costs. In this study we explore four deep neural networks (UNet, GraphCast, Spherical Fourier Neural Operator and SwinTransformer) that have proven to be state-of-the-art in weather prediction, to assess their usefulness for atmospheric tracer transport modeling. For this, we assemble the CarbonBench dataset, a systematic benchmark tailored for machine learning emulators of Eulerian atmospheric transport. Through architectural adjustments, we decouple the performance of our emulators from the distribution shift caused by a steady rise in atmospheric CO$_2$. More specifically, we center CO$_2$ input fields to zero mean and then use an explicit flux scheme and a mass fixer to ensure mass balance. This design enables stable and mass-conserving transport for over 6 months with all four neural network architectures. In our study, the SwinTransformer displays particularly strong emulation skill (90-day $R^2 > 0.99$), with physically plausible emulation even for forward runs of multiple years. This work paves the way towards high-resolution forward and inverse modeling of inert trace gases with neural networks.
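The centering, explicit flux, and mass-fixer steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`total_mass`, `center`, `step`, `toy_emulator`), the multiplicative form of the mass fixer, the choice of level 0 as the surface layer, and the assumption that air mass per cell stays constant over a step are all hypothetical simplifications made for illustration.

```python
import numpy as np

def total_mass(mixing_ratio, air_mass):
    """Total tracer mass (kg): mass mixing ratio (kg/kg) times air mass (kg), summed."""
    return float(np.sum(mixing_ratio * air_mass))

def center(mixing_ratio, air_mass):
    """Subtract the air-mass-weighted global mean; return the anomaly and the mean."""
    mean = total_mass(mixing_ratio, air_mass) / float(np.sum(air_mass))
    return mixing_ratio - mean, mean

def step(mixing_ratio, air_mass, surface_flux_mass, emulator):
    """One emulated transport step with centering, explicit fluxes and a mass fixer."""
    # 1. Center the input so the emulator never sees the rising CO2 background.
    anomaly, mean = center(mixing_ratio, air_mass)
    # 2. Emulate transport on the anomaly, then add the background back.
    predicted = emulator(anomaly) + mean
    # 3. Explicit flux scheme (assumed form): add surface fluxes to level 0.
    predicted[0] += surface_flux_mass / air_mass[0]
    # 4. Mass fixer (assumed multiplicative): rescale to the known mass budget.
    target = total_mass(mixing_ratio, air_mass) + float(np.sum(surface_flux_mass))
    return predicted * (target / total_mass(predicted, air_mass))

def toy_emulator(anomaly):
    """Stand-in for a neural network: shifts the field and deliberately breaks mass balance."""
    return np.roll(anomaly, 1, axis=-1) * 1.01

rng = np.random.default_rng(0)
shape = (4, 8, 16)  # (level, latitude, longitude), a toy grid
air_mass = rng.uniform(1e12, 2e12, size=shape)               # kg of air per cell
field = 6e-4 + rng.normal(0.0, 1e-6, size=shape)             # kg CO2 per kg air
surface_flux_mass = rng.normal(0.0, 1e3, size=shape[1:])     # kg CO2 per step

new_field = step(field, air_mass, surface_flux_mass, toy_emulator)
print(total_mass(new_field, air_mass) - total_mass(field, air_mass))
print(float(surface_flux_mass.sum()))
```

The two printed values should agree up to floating-point error: whatever the emulator does to the field, the change in total tracer mass equals the mass added by the surface fluxes.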

Paper Structure (30 sections, 4 equations, 23 figures, 1 table)

This paper contains 30 sections, 4 equations, 23 figures, 1 table.

Figures (23)

  • Figure 1: Offline atmospheric tracer transport modeling with deep neural networks.
  • Figure 2: Conceptual depiction of the four deep neural networks included in this study.
  • Figure 3: Intercomparison between the best models per architecture. In blue (a&b), the performance is evaluated by scoring the global predicted 3D field against the ground truth CO$_2$ field from the test period of the LowRes dataset, which allows for comparisons between the AI models. In orange (c&d), the performance is evaluated at ObsPack stations, which additionally allows a comparison against TM5 (dashed black lines), the transport model used to produce the ground truth dataset. At ObsPack stations, in addition to the mean scores, we also display uncertainty estimates: the standard deviation over stations divided by the square root of the number of stations. Local $R^2$ (c) and global (b) and local (d) RMSE are computed for quarterly 90-day forward runs; the decorrelation time (a) is estimated from a single 3-year forward run.
  • Figure 4: Key metrics per vertical layer for quarterly forecasts over the test set for SwinTransformer. We report metrics per time step and vertical level, i.e. they represent properties of the 2D maps of atmospheric CO$_2$ mass mixing ratios at different vertical levels. The metrics are averaged over quarterly reset 90-day forward runs. Dashed lines indicate arbitrarily set thresholds which subjectively signify stable simulation (e.g. RMSE $< 1$ ppm is a goal for many CO$_2$ MRV systems).
  • Figure 5: Maps of Total Column CO$_2$ Target, Prediction by SwinTransformer and Error for different lead times. Shown is a single forward run starting from Jan 1st, 2018.
  • ...and 18 more figures
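
The station-level scores and uncertainty estimates mentioned in the Figure 3 caption can be sketched as follows. This is an illustrative example under stated assumptions, not the paper's evaluation code: the $R^2$ definition used here (coefficient of determination, $1 - SS_{res}/SS_{tot}$), the sample standard deviation (`ddof=1`), the helper names, and the synthetic station data are all assumptions.

```python
import numpy as np

def rmse(pred, obs):
    """Root mean squared error between a predicted and an observed series."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def r2(pred, obs):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - np.mean(obs)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def mean_and_standard_error(scores):
    """Mean over stations and std. dev. over stations divided by sqrt(n_stations)."""
    scores = np.asarray(scores, dtype=float)
    return float(scores.mean()), float(scores.std(ddof=1) / np.sqrt(scores.size))

# Synthetic data: 5 hypothetical stations, 90 daily values each (ppm).
rng = np.random.default_rng(0)
n_stations, n_steps = 5, 90
obs = 410.0 + rng.normal(0.0, 2.0, size=(n_stations, n_steps))
pred = obs + rng.normal(0.0, 0.3, size=(n_stations, n_steps))

station_rmse = [rmse(p, o) for p, o in zip(pred, obs)]
station_r2 = [r2(p, o) for p, o in zip(pred, obs)]

rmse_mean, rmse_se = mean_and_standard_error(station_rmse)
r2_mean, r2_se = mean_and_standard_error(station_r2)
print(f"RMSE: {rmse_mean:.3f} +/- {rmse_se:.3f} ppm")
print(f"R^2:  {r2_mean:.3f} +/- {r2_se:.3f}")
```

Scoring each station separately and then aggregating keeps stations with long records from dominating the mean, and the standard error shrinks as more stations are included.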