Challenges of learning multi-scale dynamics with AI weather models: Implications for stability and one solution
Ashesh Chattopadhyay, Y. Qiang Sun, Pedram Hassanzadeh
TL;DR
This paper identifies spectral bias as the universal cause of instability and unphysical drift in AI weather models when they are time-integrated over long horizons. It introduces FouRKS, an architecture-agnostic framework that combines Fourier-based spectral regularization, a convergent RK4 time integrator, and a self-supervised spectrum correction to enforce physical consistency during autoregressive prediction. The authors demonstrate that FouRKS yields long-term stable, physically accurate climate emulations on a two-layer quasi-geostrophic system for hundreds of thousands of days and on ERA5 data for up to a decade, with correct means and spectral structure. These results suggest a path toward reliable, data-driven climate emulation and improved sub-seasonal-to-seasonal forecasting; the authors also acknowledge limitations and point to avenues for further theory and for generalization to radiative forcing and full climate models.
Abstract
Long-term stability and physical consistency are critical properties for AI-based weather models if they are to be used for subseasonal-to-seasonal forecasts or beyond, e.g., climate change projection. However, current AI-based weather models can provide accurate forecasts only at short lead times, since they become unstable or physically inconsistent when time-integrated beyond a few weeks or a few months. They either exhibit numerical blow-up or hallucinate unrealistic dynamics of the atmospheric variables, akin to the current class of autoregressive large language models. The cause of the instabilities is unknown, and the methods used to extend their stability horizons are ad hoc and lack rigorous theory. In this paper, we reveal that the universal causal mechanism for these instabilities in any turbulent flow is \textit{spectral bias}, wherein \textit{any} deep learning architecture is biased to learn only the large-scale dynamics and ignores the small scales completely. We further elucidate how turbulence physics and the absence of convergence in deep learning-based time-integrators amplify this bias, leading to unstable error propagation. Finally, using the quasi-geostrophic flow and European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis data as test cases, we bridge the gap between deep learning theory and numerical analysis to propose one mitigative solution to such unphysical behavior. We develop long-term physically consistent data-driven models for the climate system and demonstrate accurate short-term forecasts as well as hundreds of years of time integration with accurate mean and variability.
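
To make two of the ingredients named above more concrete, the following is a minimal sketch, not the authors' implementation. It shows (i) a classical fourth-order Runge-Kutta step wrapped around a tendency function and (ii) a loss that adds a penalty on mismatched Fourier amplitudes to the usual mean squared error, so that small scales are not ignored. The names `rk4_step`, `spectral_loss`, `combined_loss`, and the parameters `lam` and `k_cut` are hypothetical, chosen only for illustration. In a learned model the tendency would be a trained neural network; here a simple linear decay stands in for it.

```python
import numpy as np


def rk4_step(tendency, u, dt):
    """Advance the state u by one classical fourth-order Runge-Kutta step.

    `tendency` maps a state to its time derivative. In a data-driven model
    this would be a trained network (an assumption of this sketch).
    """
    k1 = tendency(u)
    k2 = tendency(u + 0.5 * dt * k1)
    k3 = tendency(u + 0.5 * dt * k2)
    k4 = tendency(u + dt * k3)
    return u + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def spectral_loss(pred, target, k_cut=0):
    """Mean squared difference of Fourier amplitudes along the last axis,
    restricted to wavenumbers >= k_cut (hypothetical cutoff parameter)."""
    amp_pred = np.abs(np.fft.rfft(pred, axis=-1))
    amp_true = np.abs(np.fft.rfft(target, axis=-1))
    diff = amp_pred[..., k_cut:] - amp_true[..., k_cut:]
    return float(np.mean(diff ** 2))


def combined_loss(pred, target, lam=0.1, k_cut=0):
    """Grid-space MSE plus a weighted spectral penalty (illustrative only)."""
    mse = float(np.mean((pred - target) ** 2))
    return mse + lam * spectral_loss(pred, target, k_cut)


# Stand-in dynamics du/dt = -u, integrated from t = 0 to t = 1.
u = np.array([1.0])
for _ in range(10):
    u = rk4_step(lambda state: -state, u, dt=0.1)

# A prediction that captures the large scale but misses wavenumber 10.
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
target = np.sin(x) + 0.5 * np.sin(10.0 * x)
pred = np.sin(x)
small_scale_error = spectral_loss(pred, target, k_cut=5)
```

In this toy setup the prediction is missing only the small-scale component, so the penalty restricted to wavenumbers at or above `k_cut` is nonzero and isolates exactly that missing component. This is the kind of signal a spectral regularizer is meant to provide.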
