WeatherFormer: Empowering Global Numerical Weather Forecasting with Space-Time Transformer

Junchao Gong, Tao Han, Kang Chen, Lei Bai

TL;DR

This work proposes WeatherFormer, a new transformer-based NWP framework that models complex spatio-temporal atmospheric dynamics to strengthen data-driven NWP, and that achieves superior performance over existing deep learning methods.

Abstract

Numerical Weather Prediction (NWP) systems are infrastructure with considerable impact on modern society. Traditional NWP systems, however, produce forecasts by solving complex partial differential equations on huge computing clusters, resulting in substantial carbon emissions. Exploring efficient and eco-friendly solutions for NWP has attracted interest from the Artificial Intelligence (AI) and earth science communities. To narrow the performance gap between AI-based methods and physical predictors, this work proposes a new transformer-based NWP framework, termed WeatherFormer, to model complex spatio-temporal atmospheric dynamics and strengthen the capability of data-driven NWP. WeatherFormer introduces space-time factorized transformer blocks to reduce parameter count and memory consumption, in which a Position-aware Adaptive Fourier Neural Operator (PAFNO) is proposed for location-sensitive token mixing. In addition, two data augmentation strategies are used to boost performance and reduce training cost. Extensive experiments on the WeatherBench dataset show that WeatherFormer achieves superior performance over existing deep learning methods and further approaches the most advanced physical model.

Paper Structure

This paper contains 27 sections, 11 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: Predicted (Pred) and ground-truth (GT) geopotential at 500 hPa for the 24-hour, 72-hour, and 120-hour forecasts.
  • Figure 2: Mapping data points with the same color from the sphere to the plate. Note that the red point is mapped to both the left and right sides of the plate, which means the left and right sides are continuous. Arrows indicate rotation on the sphere and shifting on the plate.
  • Figure 3: (a) Overview of WeatherFormer. It first divides a sequence of weather states into patch tokens. These tokens are then processed by $L$ layers of SF-Block, each of which contains a Fourier spatial mixer and a Fourier temporal mixer. Finally, a convolutional decoder decodes the output tokens into future weather states. (b) Details of the Fourier mixer. It first transforms token features to the frequency domain with a fast Fourier transform. The frequency features are then multiplied by PAFNO filters, which consist of $k$ Multilayer Perceptrons (MLPs) and $n$ frequency coefficients. Finally, the frequency tokens are transformed back to the spatial/temporal domain.
  • Figure 4: Adaptive weights assigned to neighbours by PAFNO (bottom) compared with AFNO (top). Weights (Y-axis) are obtained from the inverse DFT of the coefficients $\lambda_n$; the X-axis denotes the spatial distance to the pivot token.
  • Figure 5: Wind speed at 10 m above the surface. Redder colors represent higher wind speeds. The red box identifies the location of tropical cyclone Rumbia at a given timestamp.
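
The Fourier mixer described in the caption of Figure 3(b) follows a transform, filter, inverse-transform pattern. The NumPy sketch below illustrates only that pattern and is not the paper's implementation. It assumes a token tensor of shape (time, space, channels), a 1-D FFT along the mixed axis, and one fixed coefficient per frequency. The MLP part of the PAFNO filters is omitted, and the names `fourier_mix` and `filters` are hypothetical.

```python
import numpy as np


def fourier_mix(tokens, filters, axis):
    """Mix tokens along `axis` by filtering in the frequency domain.

    tokens  : real array, e.g. shape (time, space, channels)  (assumed layout)
    filters : 1-D array with one coefficient per frequency along `axis`
              (a stand-in for the learned PAFNO filters, which also use MLPs)
    """
    # 1) Transform token features to the frequency domain.
    freq = np.fft.fft(tokens, axis=axis)

    # 2) Multiply each frequency by its filter coefficient.
    #    Reshape so the filter broadcasts along the mixed axis only.
    shape = [1] * tokens.ndim
    shape[axis] = tokens.shape[axis]
    freq = freq * np.asarray(filters).reshape(shape)

    # 3) Transform back to the spatial/temporal domain.
    #    The real part is kept; this sketch does not enforce conjugate-symmetric filters.
    return np.fft.ifft(freq, axis=axis).real


# Example: a spatial mixer followed by a temporal mixer on random tokens.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 3))           # 4 time steps, 8 spatial tokens, 3 channels
spatial_filter = np.ones(8)              # identity filter: output equals input
temporal_filter = np.zeros(4)
temporal_filter[0] = 1.0                 # keep only the zero frequency: temporal mean

y = fourier_mix(x, spatial_filter, axis=1)
z = fourier_mix(y, temporal_filter, axis=0)
```

With these toy filters, the spatial step leaves the tokens unchanged and the temporal step replaces every time step with the mean over time, which shows how a per-frequency filter controls the mixing of tokens along one axis at a time.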