Spatiotemporal Pyramid Flow Matching for Climate Emulation

Jeremy Andrew Irvin, Jiaqi Han, Zikui Wang, Abdulaziz Alharbi, Yufei Zhao, Nomin-Erdene Bayarsaikhan, Daniele Visioni, Andrew Y. Ng, Duncan Watson-Parris

TL;DR

The paper addresses long-horizon, probabilistic emulation of expensive Earth system models. It introduces Spatiotemporal Pyramid Flows (SPF), a parallel, multi-timescale flow matching approach that cascades over space and time and conditions on external forcings, enabling direct sampling at arbitrary future times. On ClimateBench, SPF achieves better probabilistic accuracy (CRPS) and faster sampling than strong baselines. To scale SPF, the authors curate ClimateSuite, and the scaled model generalizes to held-out scenarios across climate models. Together, SPF and ClimateSuite offer a scalable, data-driven foundation for accurate, efficient climate emulation across temporal scales and scenarios, with public data and code to foster further development.
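Since the summary reports probabilistic accuracy as CRPS, a minimal sketch of the standard ensemble CRPS estimator may help readers interpret that metric. The function name `ensemble_crps` and the scalar setting are illustrative assumptions; the paper's exact evaluation (for example, any latitude weighting or per-grid-cell averaging) is not specified here.

```python
import numpy as np


def ensemble_crps(samples, observation: float) -> float:
    """Empirical CRPS of an ensemble for one scalar observation.

    Uses the standard energy form: E|X - y| - 0.5 * E|X - X'|,
    where X and X' are independent draws from the ensemble.
    Hypothetical helper for illustration, not the paper's code.
    """
    samples = np.asarray(samples, dtype=float)
    # Mean absolute error of ensemble members against the observation.
    accuracy = np.mean(np.abs(samples - observation))
    # Mean absolute difference over all ordered pairs of members.
    spread = np.mean(np.abs(samples[:, None] - samples[None, :]))
    return float(accuracy - 0.5 * spread)


# A two-member ensemble straddling the observation.
score = ensemble_crps([0.0, 2.0], 1.0)
```

Lower values are better. For a single-member ensemble the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of MAE.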

Abstract

Generative models have the potential to transform the way we emulate Earth's changing climate. Previous generative approaches rely on weather-scale autoregression for climate emulation, but this is inherently slow for long climate horizons and has yet to demonstrate stable rollouts under nonstationary forcings. Here, we introduce Spatiotemporal Pyramid Flows (SPF), a new class of flow matching approaches that model data hierarchically across spatial and temporal scales. Inspired by cascaded video models, SPF partitions the generative trajectory into a spatiotemporal pyramid, progressively increasing spatial resolution to reduce computation and coupling each stage with an associated timescale to enable direct sampling at any temporal level in the pyramid. This design, together with conditioning each stage on prescribed physical forcings (e.g., greenhouse gases or aerosols), enables efficient, parallel climate emulation at multiple timescales. On ClimateBench, SPF outperforms strong flow matching baselines and pre-trained models at yearly and monthly timescales while offering fast sampling, especially at coarser temporal levels. To scale SPF, we curate ClimateSuite, the largest collection of Earth system simulations to date, comprising over 33,000 simulation-years across ten climate models and the first dataset to include simulations of climate interventions. We find that the scaled SPF model demonstrates good generalization to held-out scenarios across climate models. Together, SPF and ClimateSuite provide a foundation for accurate, efficient, probabilistic climate emulation across temporal scales and realistic future scenarios. Data and code are publicly available at https://github.com/stanfordmlgroup/spf.

Paper Structure

This paper contains 47 sections, 28 equations, 23 figures, 9 tables.

Figures (23)

  • Figure 2: SPF flow trajectory. SPF divides generation into stages, each beginning with DiT denoising and followed by either a spatiotemporal transition (green) or a spatial-only transition (orange). Spatiotemporal transitions funnel into a timestep for the selected target period and upsample the latent in both space and time, while spatial transitions upsample only in space. This sequence of denoising and stage transitions continues until the final stage, which outputs clean samples at the target period and timescale.
  • Figure: (a) PyramidalFlow (Jin et al., 2024)
  • Figure S1: Efficiency benefits from caching. For long sequence generation, intermediate latents at coarser timescales can be cached to save compute when generating samples from finer timescales. In this dummy example, to generate a sequence of eight fine timescale samples, the flow in Timescale 1 only needs to be run once and the flow in Timescale 2 only needs to be run twice. Then, only the last part of the flow in Timescale 3 needs to be run when generating each of the eight clean Timescale 3 samples as the flow can resume from the Timescale 2 latents.
  • Figure S2: Monthly latitude-weighted global means of the 200M SPF on ClimateBench SSP2-4.5. Temperature shown in the left three columns (red) and precipitation in the right three columns (blue).
  • Figure S3: Yearly latitude-weighted global means of the 200M SPF on ClimateBench SSP2-4.5. Temperature shown in the left column (red) and precipitation in the right column (blue).
  • ...and 18 more figures
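The caching arithmetic described in the Figure S1 caption can be sketched as a small counting exercise. This is only an illustration under stated assumptions: the helper `stage_runs` and the spans `[8, 4, 1]` (how many finest-timescale samples one latent at each timescale covers) are hypothetical, chosen to match the caption's example, and the counts are numbers of flow-segment runs, not measured compute.

```python
from math import ceil


def stage_runs(n_fine: int, fine_per_latent: list[int]) -> dict[str, list[int]]:
    """Count flow-segment runs per timescale, coarsest first.

    fine_per_latent[k] is the assumed number of finest-timescale samples
    covered by one latent at timescale k (the last entry is 1).
    Hypothetical helper for illustration, not the paper's code.
    """
    # With caching, each coarse latent is computed once and reused by
    # every finer sample that falls inside the span it covers.
    cached = [ceil(n_fine / span) for span in fine_per_latent]
    # Without caching, every fine sample reruns every segment from noise.
    uncached = [n_fine] * len(fine_per_latent)
    return {"cached": cached, "uncached": uncached}


# Eight fine samples across three timescales, as in the caption's example.
runs = stage_runs(8, [8, 4, 1])
total_cached = sum(runs["cached"])      # 1 + 2 + 8 segment runs
total_uncached = sum(runs["uncached"])  # 8 + 8 + 8 segment runs
```

Under these assumed spans, the per-timescale counts with caching are 1, 2, and 8, matching the caption. The segments differ in cost (later stages run at higher spatial resolution), so these totals indicate where work is saved rather than the actual speedup.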