VarteX: Enhancing Weather Forecast through Distributed Variable Representation
Ayumu Ueyama, Kazuhiko Kawamoto, Hiroshi Kera
TL;DR
VarteX tackles learning from high-dimensional meteorological data with two ideas: $R$ representative variables, each with embedding dimension $D/R$, and a regional split training scheme that scales attention-based computation by $\mathcal{O}(1/S^2)$. It extends ClimaX's variable aggregation so that the model processes multiple representatives, and it adds a mixing stage to capture interactions. Empirical results on WeatherBench ERA5 show that VarteX achieves about 50% higher forecast accuracy (with larger gains for wind) while using roughly 55% fewer parameters, 50% less training time, and 35% less memory than ClimaX. The work demonstrates that explicit multi-representative-variable representations and region-wise training are practical for efficient, high-accuracy DL-based weather forecasting and may scale to foundation-model regimes.
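The two scaling claims in the TL;DR can be checked with simple arithmetic. The sketch below is illustrative only and is not the authors' implementation. It assumes that $S$ denotes the number of equal-sized regions the token sequence is split into, that self-attention cost is proportional to the number of query-key pairs, and that $D$ is divisible by $R$. The function names and the concrete values (512 tokens, $S = 4$, $D = 1024$, $R = 4$) are hypothetical.

```python
def attention_pairs(num_tokens: int) -> int:
    """Query-key pairs in full self-attention over num_tokens tokens."""
    return num_tokens * num_tokens


def regional_split_ratio(num_tokens: int, num_regions: int) -> float:
    """Attention cost for ONE region relative to the full token grid.

    Assumption: the tokens are split into num_regions equal regions and
    attention is computed within a single region at a time.
    """
    if num_tokens % num_regions != 0:
        raise ValueError("num_tokens must be divisible by num_regions")
    tokens_per_region = num_tokens // num_regions
    return attention_pairs(tokens_per_region) / attention_pairs(num_tokens)


def per_representative_dim(embed_dim: int, num_representatives: int) -> int:
    """Embedding width D/R given to each of the R representative variables."""
    if embed_dim % num_representatives != 0:
        raise ValueError("embed_dim must be divisible by num_representatives")
    return embed_dim // num_representatives


if __name__ == "__main__":
    # Hypothetical sizes chosen only to make the arithmetic concrete.
    print(regional_split_ratio(num_tokens=512, num_regions=4))
    print(per_representative_dim(embed_dim=1024, num_representatives=4))
```

Under these assumptions, one region costs $1/S^2$ of the full-grid attention. Processing all $S$ regions in turn would cost $S \cdot (1/S^2) = 1/S$ of the full-grid attention in total.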
Abstract
Weather forecasting is essential for various human activities. Recent data-driven models based on deep learning have outperformed numerical weather prediction in forecast performance. However, challenges remain in efficiently handling multiple meteorological variables. This study proposes a new variable aggregation scheme and an efficient learning framework to address this challenge. Experiments show that VarteX outperforms the conventional model in forecast performance while requiring significantly fewer parameters and resources. The results demonstrate the effectiveness of learning through multiple aggregations and regional split training, enabling more efficient and accurate deep learning-based weather forecasting.
