WeatherMesh-3: Fast and accurate operational global weather forecasting
Haoxing Du, Lyna Kim, Joan Creus-Costa, Jack Michaels, Anuj Shetty, Todd Hutchinson, Christopher Riedel, John Dean
TL;DR
WM-3 tackles the challenge of delivering fast, accurate global weather forecasts on limited hardware by introducing a latent-rollout transformer framework that preserves latent-space continuity across forecast steps. The encoder–processor–decoder architecture, built on NATTEN neighborhood attention and rotary embeddings, produces predictions at arbitrary lead times through repeated latent-space steps, while pretraining on ERA5 and operational fine-tuning with IFS/GFS analyses support real-time use. Empirically, WM-3 is more accurate than operational models (e.g., up to 37.7% RMSE improvement for 2-meter temperature at a 1-day lead time) and reduces forecast blur, while producing a 14-day forecast in 12 seconds on a single consumer-grade RTX 4090 GPU. The work also emphasizes accessibility and extensibility, with modular encoders for additional data sources and open-source tooling to facilitate broader deployment and future work in data assimilation and ensemble methods.
Abstract
We present WeatherMesh-3 (WM-3), an operational transformer-based global weather forecasting system that improves the state of the art in both accuracy and computational efficiency. We introduce the following advances: 1) a latent rollout that enables arbitrary-length predictions in latent space without intermediate encoding or decoding; and 2) a modular architecture that flexibly utilizes mixed-horizon processors and encodes multiple real-time analyses to create blended initial conditions. WM-3 generates 14-day global forecasts at 0.25-degree resolution in 12 seconds on a single RTX 4090. This represents a more than 100,000-fold speedup over traditional NWP approaches, while achieving up to a 37.7% improvement in RMSE over operational models and requiring only a single consumer-grade GPU for deployment. We aim for WM-3 to democratize weather forecasting by providing an accessible, lightweight model for operational use while pushing the performance boundaries of machine learning-based weather prediction.
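The latent rollout with mixed-horizon processors can be illustrated with a minimal toy sketch. This is not the WM-3 implementation: the function names (`plan_steps`, `latent_rollout`), the 1-hour and 6-hour horizons, and the toy encoder, processors, and decoder are all hypothetical, chosen only to show the control flow of encoding once, stepping repeatedly in latent space, and decoding once.

```python
from typing import Callable, Dict, List

Latent = List[float]


def plan_steps(lead_hours: int, horizons: List[int]) -> List[int]:
    """Split a lead time into processor horizons, largest first.

    Greedy splitting is an assumption of this sketch; it is adequate when
    each horizon divides the next larger one (e.g. 1 h and 6 h).
    """
    steps: List[int] = []
    remaining = lead_hours
    for h in sorted(horizons, reverse=True):
        count, remaining = divmod(remaining, h)
        steps.extend([h] * count)
    if remaining:
        raise ValueError(f"cannot reach {lead_hours} h with horizons {horizons}")
    return steps


def latent_rollout(
    state: List[float],
    lead_hours: int,
    encode: Callable[[List[float]], Latent],
    processors: Dict[int, Callable[[Latent], Latent]],
    decode: Callable[[Latent], List[float]],
) -> List[float]:
    """Encode once, apply processors in latent space, decode once."""
    z = encode(state)
    for h in plan_steps(lead_hours, list(processors)):
        z = processors[h](z)  # no intermediate decode/encode between steps
    return decode(z)


# Toy stand-ins (hypothetical): the "latent" is the state scaled by 2,
# and an h-hour processor adds h to every latent value.
def toy_encode(x: List[float]) -> Latent:
    return [2.0 * v for v in x]


def toy_decode(z: Latent) -> List[float]:
    return [v / 2.0 for v in z]


def make_toy_processor(h: int) -> Callable[[Latent], Latent]:
    return lambda z: [v + h for v in z]


toy_processors = {h: make_toy_processor(h) for h in (1, 6)}

print(plan_steps(14, [1, 6]))  # [6, 6, 1, 1]
print(latent_rollout([10.0], 14, toy_encode, toy_processors, toy_decode))
```

In this sketch a 14-hour lead time is reached with two 6-hour steps and two 1-hour steps, and the encoder and decoder each run exactly once regardless of how many processor steps are taken.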
