Mitigating Time Discretization Challenges with WeatherODE: A Sandwich Physics-Driven Neural ODE for Weather Forecasting

Peiyuan Liu, Tian Zhou, Liang Sun, Rong Jin

TL;DR

WeatherODE tackles time-discretization errors and evolving source discrepancies in weather forecasting by embedding physics into a one-stage neural ODE. It couples a wave-equation–informed velocity model, a slower-converging ViT-based advection ODE, and a time-dependent source model within a CNN–ViT–CNN sandwich, trained with multi-task supervision across intermediate steps. The approach yields substantial RMSE gains over state-of-the-art baselines on global and regional ERA5 forecasts and demonstrates flexible inference with a single 24-hour model. By combining physical principles with tailored architectural biases, WeatherODE offers a scalable, accurate framework for hybrid weather prediction with improved stability and generalization.

Abstract

In the field of weather forecasting, traditional models often grapple with discretization errors and time-dependent source discrepancies, which limit their predictive performance. In this paper, we present WeatherODE, a novel one-stage, physics-driven ordinary differential equation (ODE) model designed to enhance weather forecasting accuracy. By leveraging wave equation theory and integrating a time-dependent source model, WeatherODE effectively addresses the challenges associated with time-discretization error and dynamic atmospheric processes. Moreover, we design a CNN-ViT-CNN sandwich structure, facilitating efficient learning dynamics tailored for distinct yet interrelated tasks with varying optimization biases in advection equation estimation. Through rigorous experiments, WeatherODE demonstrates superior performance in both global and regional weather forecasting tasks, outperforming recent state-of-the-art approaches by significant margins of over 40.0% and 31.8% in root mean square error (RMSE), respectively. The source code is available at https://github.com/DAMO-DI-ML/WeatherODE.
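
For orientation, the advection equation with a source term that the abstract refers to is commonly written in the continuity form below. This is a sketch in assumed notation, not a quotation from the paper: $u$ denotes an atmospheric variable (as in the figure captions), while $\mathbf{v}$ (velocity field) and $s$ (source term) are symbols chosen here for illustration.

```latex
\[
\frac{\partial u}{\partial t} + \nabla \cdot (u\,\mathbf{v}) = s,
\qquad
\nabla \cdot (u\,\mathbf{v}) = \mathbf{v} \cdot \nabla u + u\,\nabla \cdot \mathbf{v}.
\]
```

Read this way, the three components named in the abstract map onto the three unknowns: a velocity model supplies the initial $\mathbf{v}$, the neural ODE integrates $u$ forward in time, and the source model estimates $s$.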

Paper Structure

This paper contains 38 sections, 15 equations, 10 figures, 12 tables.

Figures (10)

  • Figure 1: (a) Comparison of two-meter temperature ($t2m$) and its discrete-time derivative over a 1-hour interval. While the temperature evolves continuously, the discrete-time derivative exhibits discontinuities, leading to discretization errors. (b) Latitude-weighted RMSE for $t2m$ using models trained with different time intervals ($\Delta t$) for estimating initial velocity. Larger $\Delta t$ values result in worse performance and can even lead to numerical instability (NaN). See the time-interval ablation table for full results. (c) Comparison of temporal and spatial discretization intervals in the 5.625° ERA5 dataset. The spatial discretization is 100 times denser than the temporal discretization.
  • Figure 2: Overall architecture of WeatherODE. WeatherODE adopts a sandwich-like structure for atmosphere modeling. The top and bottom parts use fast-converging neural networks (CNN-based) to estimate the initial velocity and source term, while the central layer employs a slower-converging neural ODE (ViT-based) to model the atmospheric advection process. This design keeps training stable while the neural ODE computes the numerical solution. More analyses are in the section on the sandwich structure and in the optimization ablation.
  • Figure 3: RMSE comparison for different input configurations of the velocity model.
  • Figure 4: Visualization of the 2-meter temperature $u$ on January 1, 2017, from 3 a.m. to 10 a.m., with the estimated $\frac{\partial u}{\partial t}$ from ClimODE and WeatherODE. WeatherODE provides smoother, more continuous estimates of $\frac{\partial u}{\partial t}$, closely matching $u$, while ClimODE shows abrupt changes.
  • Figure 5: RMSE comparison for different architectures of the source model.
  • ...and 5 more figures
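
The discretization effect described in Figure 1 can be illustrated with a small numerical sketch. The code below is not taken from the WeatherODE repository: the toy sinusoidal signal, the function names, and the cosine-latitude weighting (a convention commonly used for ERA5 benchmarks) are all assumptions made for illustration. It shows that a forward-difference estimate of $\frac{\partial u}{\partial t}$ degrades as $\Delta t$ grows, and how a latitude-weighted RMSE might be computed.

```python
import math


def toy_temperature(t_hours):
    """Hypothetical smooth signal with a 24-hour period (stand-in for t2m at one grid point)."""
    return 280.0 + 5.0 * math.sin(2.0 * math.pi * t_hours / 24.0)


def toy_temperature_rate(t_hours):
    """Exact time derivative of the toy signal above."""
    omega = 2.0 * math.pi / 24.0
    return 5.0 * omega * math.cos(omega * t_hours)


def forward_difference(u, t, dt):
    """Discrete-time estimate of du/dt from two samples dt apart."""
    return (u(t + dt) - u(t)) / dt


def derivative_error(dt, t_hours=3.0):
    """Absolute error of the forward-difference estimate at t_hours."""
    estimate = forward_difference(toy_temperature, t_hours, dt)
    return abs(estimate - toy_temperature_rate(t_hours))


def latitude_weighted_rmse(pred, target, lats_deg):
    """RMSE with each latitude row weighted by cos(lat) / mean(cos(lat)).

    pred and target are lists of rows (one row per latitude),
    each row a list of values over longitudes. Assumed convention.
    """
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_weight = sum(weights) / len(weights)
    total, count = 0.0, 0
    for w, pred_row, target_row in zip(weights, pred, target):
        for p, t in zip(pred_row, target_row):
            total += (w / mean_weight) * (p - t) ** 2
            count += 1
    return math.sqrt(total / count)


if __name__ == "__main__":
    # Larger time steps give a worse estimate of the instantaneous rate.
    for dt in (1.0, 6.0, 12.0):
        print(f"dt={dt:>4} h  |error|={derivative_error(dt):.3f}")
```

In this toy setting the error grows monotonically over the three step sizes, which mirrors the qualitative trend in panel (b); it says nothing about the magnitudes reported in the paper.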