W-MAE: Pre-trained weather model with masked autoencoder for multi-variable weather forecasting

Xin Man; Chenghong Zhang; Jin Feng; Changyu Li; Jie Shao

TL;DR

W-MAE introduces a Weather model with Masked AutoEncoder pre-training built on a Vision Transformer to capture spatial correlations across multiple meteorological variables. The approach uses self-supervised MAE pre-training on ERA5 data, then fine-tunes for multi-variable forecasting and precipitation tasks, yielding robust short-to-medium horizon performance and superior precipitation accuracy relative to FourCastNet. The method enables easy transfer to other task-specific models and demonstrates significant gains in accuracy and efficiency for ensemble forecasts, with practical training-time advantages. This work highlights the value of task-agnostic pre-training for weather and climate forecasting, suggesting potential extensions to longer-term forecasting.

Abstract

Weather forecasting is a long-standing computational challenge with direct societal and economic impacts. The task involves large volumes of continuously collected data and exhibits rich spatiotemporal dependencies over long periods, making it well suited to deep learning models. In this paper, we apply pre-training techniques to weather forecasting and propose W-MAE, a Weather model with Masked AutoEncoder pre-training. W-MAE is pre-trained in a self-supervised manner to reconstruct spatial correlations within meteorological variables. On the temporal scale, we fine-tune the pre-trained W-MAE to predict the future states of meteorological variables, thereby modeling the temporal dependencies present in weather data. We conduct our experiments on the fifth-generation ECMWF Reanalysis (ERA5) data, with samples selected every six hours. Experimental results show that our W-MAE framework offers three key benefits: 1) when predicting the future state of meteorological variables, using the pre-trained W-MAE effectively alleviates the problem of cumulative prediction errors, maintaining stable performance in the short-to-medium term; 2) when predicting diagnostic variables (e.g., total precipitation), our model exhibits significant performance advantages over FourCastNet; 3) our task-agnostic pre-training scheme can be easily integrated with various task-specific models. When our pre-training framework is applied to FourCastNet, it yields an average 20% performance improvement in Anomaly Correlation Coefficient (ACC).
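
The Anomaly Correlation Coefficient (ACC) cited above, together with the latitude-weighted RMSE reported in the paper's figures, can be sketched as follows. This is a minimal illustration under a commonly used definition (anomalies taken relative to a climatological mean, grid rows weighted by the cosine of latitude). The paper's exact formulas are not reproduced on this page, and the function names and the `[lat][lon]` nested-list layout are assumptions made for the example.

```python
import math


def latitude_weights(lats_deg):
    """Weights proportional to cos(latitude), normalised so their mean is 1."""
    cos_lat = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_cos = sum(cos_lat) / len(cos_lat)
    return [c / mean_cos for c in cos_lat]


def weighted_acc(pred, truth, climatology, lats_deg):
    """Latitude-weighted ACC for 2-D fields indexed as [lat][lon].

    Assumed definition: correlation between forecast and observed anomalies,
    where an anomaly is the field minus a climatological mean field.
    """
    weights = latitude_weights(lats_deg)
    num = pred_sq = true_sq = 0.0
    for j, w in enumerate(weights):
        for p, t, c in zip(pred[j], truth[j], climatology[j]):
            pred_anom, true_anom = p - c, t - c
            num += w * pred_anom * true_anom
            pred_sq += w * pred_anom * pred_anom
            true_sq += w * true_anom * true_anom
    return num / math.sqrt(pred_sq * true_sq)


def weighted_rmse(pred, truth, lats_deg):
    """Latitude-weighted RMSE for 2-D fields indexed as [lat][lon]."""
    weights = latitude_weights(lats_deg)
    total, count = 0.0, 0
    for j, w in enumerate(weights):
        for p, t in zip(pred[j], truth[j]):
            total += w * (p - t) ** 2
            count += 1
    return math.sqrt(total / count)
```

Under this definition a perfect forecast gives an ACC of 1 and an RMSE of 0, and the cosine weighting keeps the densely sampled polar rows of a regular latitude-longitude grid from dominating either score.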

Paper Structure (18 sections, 11 equations, 9 figures, 2 tables)

Introduction
Related work
Numerical weather prediction
AI-based weather forecasting
Self-supervised pre-training techniques
Preliminaries
Dataset
Multi-variable and precipitation tasks
Pre-training method
ViT and MAE
W-MAE decoder
Experiments
Implementation details of task-agnostic pre-training
Fine-tuning for multi-variable forecasting
Fine-tuning for precipitation forecasting
...and 3 more sections

Figures (9)

Figure 1: A showcase of W-MAE pre-trained on the ERA5 dataset and then fine-tuned on two weather forecasting tasks, i.e., multi-variable forecasting and precipitation forecasting. The words in the grey boxes indicate the datasets used for pre-training and fine-tuning our W-MAE.
Figure 2: Our proposed W-MAE architecture. Following the MAE framework, an input image is first divided into patches and masked before being passed into the encoder. The encoder only processes the unmasked patches. After the encoder, the removed patches are then placed back into their original locations in the sequence of patches and fed into our W-MAE decoder to reconstruct the missing pixels.
Figure 3: The latitude-weighted ACC and RMSE curves for our W-MAE forecasts and the corresponding matched FourCastNet forecasts at a fixed initial condition in the testing dataset corresponding to the calendar year 2018 for the variables, including $Z_{500}$, $T_{850}$, $V_{10}$, and $U_{10}$.
Figure 4: Visualization examples of future state prediction for meteorological variables, including $Z_{500}$, $T_{850}$, $V_{10}$, and $U_{10}$.
Figure 5: Latitude-weighted ACC and RMSE curves for our W-MAE forecasts and the corresponding matched FourCastNet forecasts at a fixed initial condition in the testing dataset corresponding to the calendar year 2018 for TP. Please note that the results of "FourCastNet (37 years)" are obtained using the checkpoint file released by Pathak et al. (arXiv:2202.11214), representing the optimal results after two rounds of training on FourCastNet.
...and 4 more figures
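
The masking procedure described in the Figure 2 caption (split the input into patches, hide a random subset, encode only the visible patches, then put placeholders back at the masked positions before decoding) can be sketched in plain Python. This is a toy illustration, not the authors' implementation: the helper names, the 8x8 grid, the 2x2 patch size, and the 0.75 mask ratio are all assumptions chosen for the example, and the encoder is replaced by an identity step.

```python
import random


def patchify(image, patch):
    """Split a 2-D grid (list of rows) into non-overlapping patch x patch tiles."""
    height, width = len(image), len(image[0])
    assert height % patch == 0 and width % patch == 0
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles  # row-major order


def random_mask(num_patches, mask_ratio, rng):
    """Return sorted (visible, masked) patch indices for one random mask."""
    num_keep = int(num_patches * (1 - mask_ratio))
    order = list(range(num_patches))
    rng.shuffle(order)
    return sorted(order[:num_keep]), sorted(order[num_keep:])


def restore_sequence(encoded_visible, keep, num_patches, mask_token):
    """Place encoded visible patches back at their original positions and
    fill every masked position with a shared placeholder token."""
    sequence = [mask_token] * num_patches
    for index, token in zip(keep, encoded_visible):
        sequence[index] = token
    return sequence


# Toy usage: the "encoder" here is the identity, standing in for a ViT encoder.
rng = random.Random(0)
grid = [[r * 8 + c for c in range(8)] for r in range(8)]
patches = patchify(grid, 2)                       # 16 patches of size 2x2
keep, masked = random_mask(len(patches), 0.75, rng)
visible = [patches[i] for i in keep]              # only these reach the encoder
decoder_input = restore_sequence(visible, keep, len(patches), mask_token=None)
```

Because the encoder sees only the visible quarter of the patches in this setup, its cost scales with the number of kept patches rather than with the full grid; the decoder then reconstructs the masked positions from the restored sequence.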
