Predicting Change, Not States: An Alternate Framework for Neural PDE Surrogates
Anthony Zhou, Amir Barati Farimani
TL;DR
This work questions the common practice of predicting the next PDE state with neural surrogates and instead proposes learning the temporal derivative $\frac{\partial \boldsymbol{u}}{\partial t}$ so that an ODE integrator can advance the solution in time. The framework is model- and PDE-agnostic, demonstrated with Fourier Neural Operators and U-Nets across 1D and 2D equations, and shown to improve accuracy and stability in finely discretized regimes while enabling flexible time stepping. Through extensive comparisons with state prediction and various training/inference modifiers, the authors provide empirical evidence that derivative prediction can yield better rollouts and longer correlation times, although it introduces numerical integration errors that must be managed by choosing appropriate integrators and time steps. The results offer practical guidance on when to adopt derivative prediction, including insights into loss landscapes, error propagation, and computational trade-offs, and the work releases code and data to support broader adoption and further research.
Abstract
Neural surrogates for partial differential equations (PDEs) have become popular due to their potential to quickly simulate physics. With a few exceptions, neural surrogates generally treat the forward evolution of time-dependent PDEs as a black box by directly predicting the next state. While this is a natural and easy framework for applying neural surrogates, it can be an oversimplified and rigid framework for predicting physics. In this work, we evaluate an alternate framework in which neural solvers predict the temporal derivative and an ODE integrator advances the solution in time; this adds little overhead and is broadly applicable across model architectures and PDEs. We find that by simply changing the training target and introducing numerical integration during inference, neural surrogates can gain accuracy and stability in finely discretized regimes. Predicting temporal derivatives also frees models from a specific temporal discretization, allowing for flexible time-stepping during inference or training on higher-resolution PDE data. Lastly, we investigate why this framework can be beneficial and in which situations it works well.
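As a rough illustration of the framework described in the abstract, the following Python sketch shows the two pieces that change relative to state prediction: the training target becomes an estimate of $\frac{\partial \boldsymbol{u}}{\partial t}$, and inference wraps the model in an ODE integrator. Everything named here is an assumption made for illustration, not the authors' implementation: `surrogate_dudt` is a hypothetical stand-in for a trained network (here it is simply the exact spectral right-hand side of a 1D periodic heat equation), the forward-difference target is one possible choice of derivative estimate, and the grid size, diffusivity, and step sizes are arbitrary.

```python
import numpy as np

def finite_difference_target(u_now, u_next, dt):
    """One possible derivative-style training target: a forward difference
    of two stored states (assumption; other estimates are possible)."""
    return (u_next - u_now) / dt

def euler_step(f, u, dt):
    """Forward Euler: first-order accurate in dt."""
    return u + dt * f(u)

def rk4_step(f, u, dt):
    """Classical fourth-order Runge-Kutta step."""
    k1 = f(u)
    k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2)
    k4 = f(u + dt * k3)
    return u + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def rollout(f, u0, dt, n_steps, step=rk4_step):
    """Advance u0 by n_steps of size dt using derivative predictor f."""
    u = u0
    for _ in range(n_steps):
        u = step(f, u, dt)
    return u

# Hypothetical setup: 1D heat equation u_t = nu * u_xx on a periodic grid.
N, nu = 16, 0.1
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
k = np.fft.fftfreq(N, d=1.0 / N)  # integer wavenumbers

def surrogate_dudt(u):
    """Stand-in for a trained network that predicts du/dt from u.
    Here: the exact spectral evaluation of nu * u_xx."""
    return np.real(np.fft.ifft(-nu * k**2 * np.fft.fft(u)))

u0 = np.sin(x)
T = 1.0
exact = np.exp(-nu * T) * np.sin(x)  # analytic solution at time T

# The same derivative predictor can be rolled out with different step sizes
# and different integrators, without retraining.
u_rk4_coarse = rollout(surrogate_dudt, u0, dt=0.1, n_steps=10)
u_rk4_fine = rollout(surrogate_dudt, u0, dt=0.05, n_steps=20)
u_euler = rollout(surrogate_dudt, u0, dt=0.1, n_steps=10, step=euler_step)

err_rk4_coarse = np.max(np.abs(u_rk4_coarse - exact))
err_rk4_fine = np.max(np.abs(u_rk4_fine - exact))
err_euler = np.max(np.abs(u_euler - exact))

# Training-target check: the forward difference of two closely spaced
# exact states approximates the true derivative at u0.
dt_data = 0.01
u_next = np.exp(-nu * dt_data) * np.sin(x)
target = finite_difference_target(u0, u_next, dt_data)
target_err = np.max(np.abs(target - surrogate_dudt(u0)))

print(err_rk4_coarse, err_rk4_fine, err_euler, target_err)
```

In this toy setting the derivative predictor is exact, so any remaining rollout error comes from the integrator alone, which is why the choice of integrator and step size matters. With a learned model, the network's own prediction error would be added on top of this integration error.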
