FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
Boris Bonev, Thorsten Kurth, Ankur Mahesh, Mauro Bisson, Jean Kossaifi, Karthik Kashinath, Anima Anandkumar, William D. Collins, Michael S. Pritchard, Alexander Keller
TL;DR
The work tackles fast, probabilistic, global weather forecasting at high resolution while preserving physically realistic spectra. It introduces FourCastNet 3 (FCN3), a spherical-geometry-aware, purely convolutional neural operator framework that is formulated as a hidden Markov model to generate ensembles and trained with a joint spatial–spectral CRPS objective. FCN3 surpasses leading conventional ensemble models in probabilistic skill and rivals the best diffusion-based methods, while producing forecasts 8 to 60 times faster, and its spectra remain stable at lead times of up to 60 days. Training scales to 1024 GPUs and more via Makani, and the model is openly released, positioning FCN3 as a practical, extensible platform for next-generation subseasonal forecasting with large ensembles.
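To make the CRPS term concrete, the sketch below implements the generic textbook ensemble estimator of the continuous ranked probability score for a single scalar quantity. It is not the FCN3 training loss: the summary only states that the objective combines spatial and spectral CRPS terms, and the function name `ensemble_crps` and its signature are illustrative assumptions.

```python
from typing import Sequence


def ensemble_crps(members: Sequence[float], observation: float) -> float:
    """Generic ensemble CRPS estimator for one scalar (illustrative only).

    CRPS = mean_i |x_i - y| - 0.5 * mean_{i,j} |x_i - x_j|

    The first term rewards members that are close to the observation;
    the second term credits the ensemble for its spread.
    """
    n = len(members)
    if n == 0:
        raise ValueError("need at least one ensemble member")
    # Mean absolute error of the members against the observation.
    skill = sum(abs(x - observation) for x in members) / n
    # Mean absolute difference over all ordered member pairs (i, j).
    spread = sum(abs(xi - xj) for xi in members for xj in members) / (n * n)
    return skill - 0.5 * spread


if __name__ == "__main__":
    # Two members bracketing the observation.
    print(ensemble_crps([0.0, 2.0], 1.0))
```

For a single-member ensemble the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of mean absolute error.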
Abstract
FourCastNet 3 advances global weather modeling by implementing a scalable, geometric machine learning (ML) approach to probabilistic ensemble forecasting. The approach is designed to respect spherical geometry and to accurately model the spatially correlated probabilistic nature of the problem, resulting in stable spectra and realistic dynamics across multiple scales. FourCastNet 3 delivers forecasting accuracy that surpasses leading conventional ensemble models and rivals the best diffusion-based methods, while producing forecasts 8 to 60 times faster than these approaches. In contrast to other ML approaches, FourCastNet 3 demonstrates excellent probabilistic calibration and retains realistic spectra, even at extended lead times of up to 60 days. All of these advances are realized using a purely convolutional neural network architecture tailored for spherical geometry. Scalable and efficient large-scale training on 1024 GPUs and more is enabled by a novel training paradigm for combined model- and data-parallelism, inspired by domain decomposition methods in classical numerical models. Additionally, FourCastNet 3 enables rapid inference on a single GPU, producing a 60-day global forecast at 0.25°, 6-hourly resolution in under 4 minutes. Its computational efficiency, medium-range probabilistic skill, spectral fidelity, and rollout stability at subseasonal timescales make it a strong candidate for improving meteorological forecasting and early warning systems through large ensemble predictions.
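The single-GPU inference figure can be unpacked with a little arithmetic. The sketch below derives the number of autoregressive steps and the implied average time budget per step from the numbers quoted in the abstract. The grid dimensions are an assumption: they follow the common equiangular convention in which both poles are grid rows, which the abstract does not state.

```python
# Figures quoted in the abstract.
FORECAST_DAYS = 60
STEP_HOURS = 6
WALL_CLOCK_LIMIT_S = 4 * 60  # "under 4 minutes"
GRID_DEG = 0.25

# Number of 6-hourly steps in a 60-day rollout.
steps = FORECAST_DAYS * 24 // STEP_HOURS

# Upper bound on the mean wall-clock time per step, including any overhead.
max_seconds_per_step = WALL_CLOCK_LIMIT_S / steps

# Assumed grid layout: equiangular, with both poles included as rows.
n_lon = round(360 / GRID_DEG)
n_lat = round(180 / GRID_DEG) + 1

if __name__ == "__main__":
    print(f"{steps} steps, under {max_seconds_per_step:.1f} s per step on average")
    print(f"assumed grid: {n_lat} x {n_lon} points per field")
```

Under these figures, a 60-day rollout is 240 steps, so the quoted runtime corresponds to an average of less than one second per global 0.25° step.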
