CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer

Yang Liu, Zinan Zheng, Jiashun Cheng, Fugee Tsung, Deli Zhao, Yu Rong, Jia Li

TL;DR

Subseasonal-to-Seasonal (S2S) forecasting remains challenging because of atmospheric chaos and the geometric distortion introduced when the globe is treated as a planar image. CirT addresses this by partitioning the graticule into equidistant latitudinal circular patches and performing self-attention in the frequency domain via a Fourier transform, enabling global, periodic spatial coupling. Direct biweekly prediction on ERA5 shows that CirT outperforms state-of-the-art data-driven models and skillful numerical systems, and ablations validate the importance of circular patching and frequency-domain mixing. The approach offers a geometry-aware pathway for robust global S2S forecasting and points to extensions that incorporate vertical coupling and slow-evolving Earth system components.

Abstract

Accurate Subseasonal-to-Seasonal (S2S) climate forecasting is pivotal for decision-making, including agricultural planning and disaster preparedness, but is known to be challenging because of the chaotic nature of the atmosphere. Although recent data-driven models have shown promising results, their performance is limited by inadequate consideration of geometric inductive biases. They usually treat spherical weather data as planar images, resulting in an inaccurate representation of locations and spatial relations. In this work, we propose the geometry-inspired Circular Transformer (CirT) to model the cyclic characteristic of the graticule, consisting of two key designs: (1) decomposing the weather data by latitude into circular patches that serve as input tokens to the Transformer; (2) leveraging the Fourier transform in self-attention to capture global information and model spatial periodicity. Extensive experiments on the ECMWF Reanalysis v5 (ERA5) dataset demonstrate that our model yields a significant improvement over advanced data-driven models, including PanguWeather and GraphCast, as well as skillful ECMWF systems. Additionally, we empirically show the effectiveness of our model designs and high-quality prediction over spatial and temporal dimensions.
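
The two designs can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 121 x 240 grid, the three-channel random input, and the helper names `circular_patches`, `to_frequency`, and `to_spatial` are assumptions made for illustration, and the attention layers themselves are omitted. Each latitude ring becomes one token, and a real FFT along longitude moves that token into the frequency domain and back.

```python
import numpy as np


def circular_patches(field):
    """Split a (channels, n_lat, n_lon) field into one patch per latitude ring.

    Returns an array of shape (n_lat, channels, n_lon): each row is a
    closed circle of longitudes, so there is no artificial left/right edge.
    """
    return field.transpose(1, 0, 2)


def to_frequency(patches):
    """Real DFT along the longitude axis (illustrative stand-in for the DFT step)."""
    return np.fft.rfft(patches, axis=-1)


def to_spatial(spectrum, n_lon):
    """Inverse real DFT back to the spatial domain (stand-in for the IDFT step)."""
    return np.fft.irfft(spectrum, n=n_lon, axis=-1)


# Hypothetical input: 3 variables on an assumed 121 x 240 latitude-longitude grid.
rng = np.random.default_rng(0)
field = rng.standard_normal((3, 121, 240))

patches = circular_patches(field)               # (121, 3, 240): 121 tokens
spectrum = to_frequency(patches)                # (121, 3, 121) complex coefficients
recovered = to_spatial(spectrum, n_lon=240)     # back to (121, 3, 240)

# Rotating every ring by 7 grid points changes only the phase of each
# coefficient, so the magnitudes of the spectrum are unchanged.
shifted_spectrum = to_frequency(np.roll(patches, 7, axis=-1))
same_magnitude = np.allclose(np.abs(shifted_spectrum), np.abs(spectrum))
```

In this sketch, a rotation around the latitude circle leaves the spectral magnitudes untouched, which is one way to see why a frequency-domain representation suits the periodic longitude axis better than a planar image patch with hard boundaries.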

Paper Structure

This paper contains 37 sections, 12 equations, 16 figures, 9 tables.

Figures (16)

  • Figure 1: Planar and spherical views of 2-metre temperature. Treating the spherical field as a planar image results in distortion.
  • Figure 2: CirT architecture and circular patching examples. The input tensors are first decomposed by latitude, resulting in a set of circular patches. These patches are then fed into a series of Transformer blocks, where DFT and IDFT are applied in each block to transform information between the frequency and spatial domains. Finally, the output head maps the representation to biweekly predictions.
  • Figure 3: RMSE comparison between CirT and data-driven and numerical methods on geopotential $z$, temperature $t$, and wind components $u$ and $v$ at different pressure levels. FCN, GC, and PW are short for FourCastNetV2, GraphCast, and PanguWeather. A lighter color indicates better results: CirT consistently outperforms all models.
  • Figure 4: The global RMSE distribution of t850 at lead times of weeks 3-4 on the test set: CirT performs strongly across different regions.
  • Figure 5: The monthly RMSE of t500 on the test set: CirT outperforms baselines across all months.
  • ...and 11 more figures