FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators

Thorsten Kurth, Shashank Subramanian, Peter Harrington, Jaideep Pathak, Morteza Mardani, David Hall, Andrea Miele, Karthik Kashinath, Animashree Anandkumar

TL;DR

Physics-based numerical weather prediction (NWP) remains computationally expensive for high-resolution global forecasts and large ensembles. This work introduces FourCastNet, a data-driven Earth-system emulator built on an Adaptive Fourier Neural Operator (AFNO) transformer, which produces high-resolution global forecasts at far higher throughput than traditional NWP. The reported results are: training completes in 67.4 minutes on 3,072 GPUs of JUWELS Booster, a 100-member ensemble inference takes 12.41 seconds on a single Selene node, and performance peaks at 140.8 petaFLOPS on 3,808 GPUs. The authors argue that these results open the way to real-time, large-ensemble forecasting and digital-twin Earth applications, and they outline the hardware and software needs of future exascale AI-enabled weather and climate computing.

Abstract

Extreme weather amplified by climate change is causing increasingly devastating impacts across the globe. The current use of physics-based numerical weather prediction (NWP) limits accuracy due to high computational cost and strict time-to-solution limits. We report that a data-driven deep learning Earth system emulator, FourCastNet, can predict global weather and generate medium-range forecasts five orders-of-magnitude faster than NWP while approaching state-of-the-art accuracy. FourCastNet is optimized and scales efficiently on three supercomputing systems: Selene, Perlmutter, and JUWELS Booster up to 3,808 NVIDIA A100 GPUs, attaining 140.8 petaFLOPS in mixed precision (11.9% of peak at that scale). The time-to-solution for training FourCastNet measured on JUWELS Booster on 3,072 GPUs is 67.4 minutes, resulting in an 80,000 times faster time-to-solution relative to state-of-the-art NWP, in inference. FourCastNet produces accurate instantaneous weather predictions for a week in advance, enables enormous ensembles that better capture weather extremes, and supports higher global forecast resolutions.
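
As a rough consistency check on the "11.9% of peak" figure, the short Python sketch below divides the reported 140.8 petaFLOPS by an aggregate peak for 3,808 GPUs. The per-GPU peak of 312 TFLOPS (the A100 dense FP16/BF16 tensor-core rate) is an assumption made here; the abstract does not say which peak the authors used. The function name `fraction_of_peak` is likewise only illustrative.

```python
# Rough check of the "11.9% of peak" figure quoted in the abstract.
# ASSUMPTION: "peak" means the A100 dense FP16/BF16 tensor-core rate,
# 312 TFLOPS per GPU. The abstract does not state which peak was used.
A100_PEAK_TFLOPS = 312.0


def fraction_of_peak(achieved_pflops: float, n_gpus: int,
                     peak_tflops_per_gpu: float = A100_PEAK_TFLOPS) -> float:
    """Return achieved throughput as a fraction of aggregate peak."""
    aggregate_peak_pflops = n_gpus * peak_tflops_per_gpu / 1000.0
    return achieved_pflops / aggregate_peak_pflops


# Values taken from the abstract: 140.8 petaFLOPS on 3,808 GPUs.
frac = fraction_of_peak(140.8, 3808)
print(f"{100 * frac:.1f}% of peak")  # prints "11.9% of peak"
```

Under that assumption the aggregate peak is about 1,188 petaFLOPS, and the ratio agrees with the percentage stated in the abstract.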

Paper Structure

This paper contains 27 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Comparison of performance metrics (ACC: Anomaly Correlation Coefficient; RMSE: Root Mean Squared Error) between downsampled FourCastNet predictions, downsampled IFS, and a baseline state-of-the-art deep learning weather prediction (DLWP) model (Weyn et al., 2021) for $Z_{500}$, the geopotential height at 500 hPa, a key determinant of global weather patterns. FourCastNet significantly outperforms DLWP and predicts at 8× higher resolution.
  • Figure 2: The AFNO architecture, showing the key operations performed on the input tensor with dimensions ($20 \times 720 \times 1440$) to produce a 6-hour single-time-step forecast with the same dimensions. Model parallelism is implemented by splitting the channels (feature maps) across GPUs. Channel-mixing MLP operations require communication across the model-parallel ranks, while the FFT-based spatial mixing operates on disjoint blocks and is embarrassingly parallel.
  • Figure 3: FourCastNet scaling on JUWELS Booster (top), Perlmutter (center) and Selene (bottom) for various model instance sizes.
  • Figure 4: Validation loss as a function of wall-clock time for various FourCastNet configurations on JUWELS Booster, Perlmutter and Selene. The plot shows a significant reduction in solution times as parallelism increases.
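
The two metrics named in the Figure 1 caption, ACC and RMSE, can be illustrated with a minimal pure-Python sketch on a small latitude-longitude grid. The cos(latitude) weighting, the cell-centered latitude layout, and the function names `latitude_weights`, `rmse`, and `acc` are assumptions made for illustration; the caption does not give the exact definitions used in the paper.

```python
import math


def latitude_weights(n_lat):
    """cos(latitude) weights, normalized to mean 1 (ASSUMED weighting).

    Latitudes are assumed to be cell centers running north to south.
    """
    lats = [90.0 - (i + 0.5) * 180.0 / n_lat for i in range(n_lat)]
    w = [math.cos(math.radians(lat)) for lat in lats]
    mean_w = sum(w) / n_lat
    return [x / mean_w for x in w]


def rmse(forecast, truth):
    """Latitude-weighted root mean squared error over a [lat][lon] grid."""
    n_lat, n_lon = len(truth), len(truth[0])
    w = latitude_weights(n_lat)
    total = sum(w[i] * (forecast[i][j] - truth[i][j]) ** 2
                for i in range(n_lat) for j in range(n_lon))
    return math.sqrt(total / (n_lat * n_lon))


def acc(forecast, truth, climatology):
    """Latitude-weighted anomaly correlation coefficient.

    Anomalies are deviations from a climatological mean field.
    Undefined (division by zero) if either anomaly field is all zeros.
    """
    n_lat, n_lon = len(truth), len(truth[0])
    w = latitude_weights(n_lat)
    num = den_f = den_t = 0.0
    for i in range(n_lat):
        for j in range(n_lon):
            fa = forecast[i][j] - climatology[i][j]  # forecast anomaly
            ta = truth[i][j] - climatology[i][j]     # true anomaly
            num += w[i] * fa * ta
            den_f += w[i] * fa * fa
            den_t += w[i] * ta * ta
    return num / math.sqrt(den_f * den_t)


# Toy 4 x 8 grid with synthetic values (not real weather data).
n_lat, n_lon = 4, 8
truth = [[math.sin(2 * math.pi * j / n_lon) + 0.1 * i for j in range(n_lon)]
         for i in range(n_lat)]
climatology = [[0.0] * n_lon for _ in range(n_lat)]
biased = [[v + 2.0 for v in row] for row in truth]    # constant +2 bias
flipped = [[-v for v in row] for row in truth]        # sign-reversed anomalies

print(rmse(truth, truth), rmse(biased, truth))
print(acc(truth, truth, climatology), acc(flipped, truth, climatology))
```

A perfect forecast gives an RMSE of 0 and an ACC of 1, a constant bias of 2 gives an RMSE of 2 because the weights average to 1, and a sign-reversed anomaly field gives an ACC of -1.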