ClimSim-Online: A Large Multi-scale Dataset and Framework for Hybrid ML-physics Climate Emulation

Sungduk Yu; Zeyuan Hu; Akshay Subramaniam; Walter Hannah; Liran Peng; Jerry Lin; Mohamed Aziz Bhouri; Ritwik Gupta; Björn Lütjens; Justus C. Will; Gunnar Behrens; Julius J. M. Busecke; Nora Loose; Charles I. Stern; Tom Beucler; Bryce Harrop; Helge Heuer; Benjamin R. Hillman; Andrea Jenney; Nana Liu; Alistair White; Tian Zheng; Zhiming Kuang; Fiaz Ahmed; Elizabeth Barnes; Noah D. Brenowitz; Christopher Bretherton; Veronika Eyring; Savannah Ferretti; Nicholas Lutsko; Pierre Gentine; Stephan Mandt; J. David Neelin; Rose Yu; Laure Zanna; Nathan Urban; Janni Yuval; Ryan Abernathey; Pierre Baldi; Wayne Chuang; Yu Huang; Fernando Iglesias-Suarez; Sanket Jantre; Po-Lun Ma; Sara Shamekh; Guang Zhang; Michael Pritchard

TL;DR

ClimSim-Online tackles the challenge of simulating subgrid-scale atmospheric processes by providing the largest multi-scale climate dataset and an end-to-end pipeline for developing and evaluating ML-based parameterizations in hybrid ML-physics climate models. By coupling high-fidelity cloud-resolving simulations with host climate dynamics through a containerized workflow and TorchScript-based Fortran integration, the work enables systematic offline training and online testing of emulators. The study demonstrates baseline ML architectures, shows how expanded inputs/targets improve skill, and highlights physics-informed constraints that stabilize online coupling, achieving notable reductions in online error compared to purely data-driven approaches. The framework and dataset aim to lower entry barriers for ML researchers, support reproducibility, and pave the way for operationally relevant hybrid climate simulations that can scale to future climate states and more comprehensive Earth system components.

Abstract

Modern climate projections lack adequate spatial and temporal resolution due to computational constraints, leading to inaccuracies in representing critical processes like thunderstorms that occur on the sub-resolution scale. Hybrid methods combining physics with machine learning (ML) offer faster, higher fidelity climate simulations by outsourcing compute-hungry, high-resolution simulations to ML emulators. However, these hybrid ML-physics simulations require domain-specific data and workflows that have been inaccessible to many ML experts. As an extension of the ClimSim dataset (Yu et al., 2024), we present ClimSim-Online, which also includes an end-to-end workflow for developing hybrid ML-physics simulators. The ClimSim dataset includes 5.7 billion pairs of multivariate input/output vectors, capturing the influence of high-resolution, high-fidelity physics on a host climate simulator's macro-scale state. The dataset is global and spans ten years at a high sampling frequency. We provide a cross-platform, containerized pipeline to integrate ML models into operational climate simulators for hybrid testing. We also implement various ML baselines, alongside a hybrid baseline simulator, to highlight the ML challenges of building stable, skillful emulators. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res) and code (https://leap-stc.github.io/ClimSim and https://github.com/leap-stc/climsim-online) are publicly released to support the development of hybrid ML-physics and high-fidelity climate simulations.

Paper Structure (18 sections, 1 equation, 3 figures, 2 tables)

Introduction
Overview
Concepts and Terminology from Earth Science
Related Work
ClimSim Dataset
Offline Experiments
Baseline Architectures
Offline Skill Boost from Expanding Features and Targets
Evaluation Metrics
Baseline Model Results
Physics-Informed Guidance to Improve Generalizability and Online Performance
Hybrid Testing and Online Performance Evaluation
Software to Integrate ML Models into Physical Climate Simulations
Metrics for Evaluating Online Errors in Hybrid Climate Simulations
Experiment Setup for Hybrid Online Testing
...and 3 more sections

Figures (3)

Figure 1: The spatially-local version of ClimSim that our baselines are scored on. A spatially-global version of the problem that expands to the full list of variables would be useful to try.
Figure 2: (a) Summary, where $dT/dt$ and $dq/dt$ are the tendencies of temperature and specific humidity, respectively, vertically integrated with mass weighting. (b,c) retain the vertical structure of MAE, and (d,e) that of $R^2$. Error bars and grey shading show the 5th- to 95th-percentile range of the MLP. Refer to Table \ref{tab:baselinevars} for variable definitions.
Figure 3: (a) Offline $R^2$ scores across target variables for the MLP, U-Net, and U-Net with physics constraints. The variables are the full set of targets listed in Table S1, including temperature tendency ($\frac{dT}{dt}$), water vapor tendency ($\frac{dQ_{\text{v}}}{dt}$), liquid cloud mixing ratio tendency ($\frac{dQ_{\text{c}}}{dt}$), ice cloud mixing ratio tendency ($\frac{dQ_{\text{i}}}{dt}$), zonal wind tendency ($\frac{dU}{dt}$), meridional wind tendency ($\frac{dV}{dt}$), and eight flux variables. (b,c) Online monthly, globally averaged RMSE of temperature (K) and moisture (g/kg) over a one-year period, comparing the baseline MLP, U-Net, and physics-constrained U-Net models against the reference E3SM-MMF simulation. The global average is taken both horizontally and vertically and is weighted by the mass in each grid cell. Atmospheric unpredictability (black dashed lines) is estimated by running the reference E3SM-MMF simulation multiple times from the same initial condition while allowing random rounding errors to grow chaotically.
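
The metrics named in the captions above (MAE, $R^2$, and mass-weighted RMSE) can be illustrated with a minimal sketch. It uses plain Python lists and the textbook definitions. The function names, the flat-list data layout, and the choice to take the square root of the mass-weighted mean squared error are assumptions for illustration, not the paper's evaluation code.

```python
import math


def mae(pred, truth):
    """Mean absolute error over paired samples."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def r2(pred, truth):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, truth))
    ss_tot = sum((t - mean_t) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


def mass_weighted_rmse(hybrid, reference, mass):
    """RMSE in which each grid cell's squared error is weighted by its air mass.

    Assumption: the weighting is applied to the squared errors before the
    square root; the paper may aggregate differently.
    """
    weighted_sq_err = sum(
        m * (h - r) ** 2 for h, r, m in zip(hybrid, reference, mass)
    )
    return math.sqrt(weighted_sq_err / sum(mass))


# Toy values (hypothetical, not taken from the dataset).
truth = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.0, 2.5, 4.0]
offline_mae = mae(pred, truth)
offline_r2 = r2(pred, truth)

# Two grid cells of temperature (K); the heavier cell dominates the error.
online_rmse = mass_weighted_rmse([280.0, 290.0], [281.0, 290.0], [3.0, 1.0])
```

With mass weighting, a 1 K error in a cell holding three quarters of the total mass contributes three quarters of the mean squared error, so thin upper-atmosphere layers do not dominate the global score.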
