Numerical models outperform AI weather forecasts of record-breaking extremes

Zhongwei Zhang, Erich Fischer, Jakob Zscheischler, Sebastian Engelke

TL;DR

For record-breaking weather extremes, the numerical High RESolution forecast (HRES) model from the European Centre for Medium-Range Weather Forecasts still consistently outperforms the state-of-the-art AI models GraphCast, GraphCast operational, Pangu-Weather, Pangu-Weather operational, and Fuxi.

Abstract

Artificial intelligence (AI)-based models are revolutionizing weather forecasting and have surpassed leading numerical weather prediction systems on various benchmark tasks. However, their ability to extrapolate and reliably forecast unprecedented extreme events remains unclear. Here, we show that for record-breaking weather extremes, the numerical model High RESolution forecast (HRES) from the European Centre for Medium-Range Weather Forecasts still consistently outperforms state-of-the-art AI models GraphCast, GraphCast operational, Pangu-Weather, Pangu-Weather operational, and Fuxi. We demonstrate that forecast errors in AI models are consistently larger for record-breaking heat, cold, and wind than in HRES across nearly all lead times. We further find that the examined AI models tend to underestimate both the frequency and intensity of record-breaking events, and they underpredict hot records and overestimate cold records with growing errors for larger record exceedance. Our findings underscore the current limitations of AI weather models in extrapolating beyond their training domain and in forecasting the potentially most impactful record-breaking weather events that are particularly frequent in a rapidly warming climate. Further rigorous verification and model development are needed before these models can be solely relied upon for high-stakes applications such as early warning systems and disaster management.

Paper Structure

This paper contains 3 sections, 11 equations, 25 figures.

Figures (25)

  • Figure 1: Model performance on all events and record-breaking events. a, Number of heat records in 2020 in ERA5. b, Number of heat records per latitude. c--g, Root mean square error (RMSE) of forecasted 2m temperature and 10m wind speed over land (excluding the Antarctic region) of HRES, Pangu-Weather, GraphCast, and Fuxi for all events (c, f) and only record-breaking events (d, e, g) in 2020 for different lead times. The transparent shaded areas indicate $95\%$ confidence bands.
  • Figure 2: Forecast bias against record exceedance. a, Forecast bias of the maximum heat records (GraphCast). b--d, RMSE of 2m temperature for heat and cold records, and 10m wind speed for wind records for events in 2020 that exceed the record by at least a certain margin (x-axis). Only land pixels (excluding the Antarctic region) are considered. e--g, Forecast bias of heat, cold, and wind records, for events that exceed the record by at least a certain margin. The transparent shaded areas indicate $95\%$ confidence bands.
  • Figure 3: Prediction of occurrence of record-breaking events. a--c, Counts (in thousands) of heat, cold, and wind records in the ground truth ERA5 and HRES-fc0 data, and GraphCast and HRES forecast data for 2020, as well as counts of their true positives (TP) over land (excluding the Antarctic region). d--f, Precision and recall curves of GraphCast and HRES forecasts when the records are used as the threshold for different lead times. g--i, Correlations between the indicator functions of whether the ground truth or 2-day forecasts exceed the record.
  • Figure 4: Illustration of our definitions of record and extrapolation. a, Daily time series of 2m temperature at the location with latitude $34.75$ and longitude $-112.25$ in 2020 (black), and monthly max/min records (in green) at this location, where orange points indicate the record-breaking events in August. b, Daily time series of 10m wind speed and monthly max records at the same location. c, Scatter plots of 2m temperature and 10m wind speed in August in the training period from 1979--2017 (in grey) and in the evaluation year 2020 (in black) at this location. The blue line represents the convex hull formed by the training data, while the green rectangle shows the max/min records in the training period. Orange points indicate the record-breaking events in the evaluation period.
  • Figure 5: Number of record-breaking events over land (excluding the Antarctic region) in 2020 in ERA5. a and c, Number of cold and wind records. b and d, Number of cold and wind records per latitude.
  • ...and 20 more figures
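
The captions above refer to monthly records taken over a training period, RMSE and bias restricted to record-breaking events, and precision and recall when the record is used as the threshold. The following is a minimal sketch of how such quantities might be computed at a single grid point for heat records. It is not the authors' code: the function names, the toy numbers, and the choice to treat a month without training data as having an unreachable record are all assumptions made for illustration.

```python
import numpy as np

def monthly_max_records(train_values, train_months):
    """Per-calendar-month maximum over the training period (heat or wind record).

    Returns an array indexed by month number (1..12). Months without any
    training data get +inf, so nothing can be flagged as a record there
    (an assumption of this sketch).
    """
    records = np.full(13, np.inf)
    for m in np.unique(train_months):
        records[m] = train_values[train_months == m].max()
    return records

def exceeds_record(values, months, records):
    """Boolean mask: True where a value is above the record of its month."""
    return values > records[months]

def rmse_and_bias(forecast, truth, mask):
    """RMSE and mean bias (forecast minus truth) over the masked events only."""
    err = forecast[mask] - truth[mask]
    return float(np.sqrt(np.mean(err ** 2))), float(np.mean(err))

def precision_recall(forecast_hits, truth_hits):
    """Precision and recall of forecasted record exceedances."""
    tp = np.sum(forecast_hits & truth_hits)
    precision = tp / max(forecast_hits.sum(), 1)
    recall = tp / max(truth_hits.sum(), 1)
    return float(precision), float(recall)

# Toy training data: three August days and two September days (hypothetical values).
train_values = np.array([30.0, 31.0, 29.0, 25.0, 26.0])
train_months = np.array([8, 8, 8, 9, 9])
records = monthly_max_records(train_values, train_months)

# Toy evaluation data: ground truth and a forecast for four days.
months = np.array([8, 8, 9, 9])
truth = np.array([32.0, 30.0, 27.0, 25.0])
forecast = np.array([31.5, 31.2, 25.5, 25.0])

truth_hits = exceeds_record(truth, months, records)
forecast_hits = exceeds_record(forecast, months, records)

rmse, bias = rmse_and_bias(forecast, truth, truth_hits)
precision, recall = precision_recall(forecast_hits, truth_hits)
print(f"RMSE on records: {rmse:.3f}, bias on records: {bias:.3f}")
print(f"precision: {precision:.2f}, recall: {recall:.2f}")
```

In this toy case the forecast is too cool on both true record days, which gives a negative bias on heat records. This is the sign of error the abstract describes for the AI models. Cold records would use the monthly minimum and a "less than" comparison instead.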