Predicting Change, Not States: An Alternate Framework for Neural PDE Surrogates

Anthony Zhou, Amir Barati Farimani

TL;DR

This work questions the common practice of predicting the next PDE state with neural surrogates and instead proposes learning the temporal derivative $\frac{\partial \boldsymbol{u}}{\partial t}$ so an ODE integrator can advance the solution in time. The framework is model- and PDE-agnostic, demonstrated with Fourier Neural Operators and Unets across 1D and 2D equations, and shown to improve accuracy and stability in finely discretized regimes while enabling flexible time stepping. Through extensive comparisons with state-prediction and various training/inference modifiers, the authors provide empirical evidence that derivative prediction can yield better rollouts and longer correlation times, albeit with numerical integration errors that must be managed by choosing appropriate integrators and time steps. The results offer practical guidance for when to adopt derivative prediction, including insights into loss landscapes, error propagation, and computational trade-offs, and the work releases code and data to support broader adoption and further research.

Abstract

Neural surrogates for partial differential equations (PDEs) have become popular due to their potential to quickly simulate physics. With a few exceptions, neural surrogates generally treat the forward evolution of time-dependent PDEs as a black box by directly predicting the next state. While this is a natural and easy framework for applying neural surrogates, it can be an over-simplified and rigid framework for predicting physics. In this work, we evaluate an alternate framework in which neural solvers predict the temporal derivative and an ODE integrator advances the solution in time; this adds little overhead and is broadly applicable across model architectures and PDEs. We find that by simply changing the training target and introducing numerical integration during inference, neural surrogates can gain accuracy and stability in finely discretized regimes. Predicting temporal derivatives also frees models from a specific temporal discretization, allowing flexible time-stepping during inference or training on higher-resolution PDE data. Lastly, we investigate why this framework can be beneficial and in which situations it works well.
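
As a rough illustration of the inference procedure described in the abstract, the sketch below rolls out a derivative-predicting model with either a Forward Euler or an RK4 integrator. It is a minimal sketch under stated assumptions, not the authors' implementation: `predict_derivative` is a hypothetical stand-in for a trained surrogate, and it returns the exact derivative of the toy system $\frac{du}{dt} = -u$ so that the integrators can be compared against a known solution. The helper names `euler_step`, `rk4_step`, and `rollout` are likewise illustrative.

```python
import numpy as np

def predict_derivative(u):
    # Hypothetical stand-in for a trained network f_theta(u) ~ du/dt.
    # Assumption: toy dynamics du/dt = -u, so the exact solution is u0 * exp(-t).
    return -u

def euler_step(f, u, dt):
    # Forward Euler: one derivative evaluation per step.
    return u + dt * f(u)

def rk4_step(f, u, dt):
    # Classical fourth-order Runge-Kutta: four derivative evaluations per step.
    k1 = f(u)
    k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2)
    k4 = f(u + dt * k3)
    return u + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def rollout(f, u0, dt, n_steps, step=rk4_step):
    # Autoregressive rollout: the integrator, not the model, advances time.
    u = np.asarray(u0, dtype=float)
    trajectory = [u]
    for _ in range(n_steps):
        u = step(f, u, dt)
        trajectory.append(u)
    return np.stack(trajectory)

u0 = np.ones(8)               # toy "spatial" state with 8 nodes
exact = u0 * np.exp(-1.0)     # exact solution at t = 1

traj_euler = rollout(predict_derivative, u0, dt=0.1, n_steps=10, step=euler_step)
traj_rk4 = rollout(predict_derivative, u0, dt=0.1, n_steps=10, step=rk4_step)
# The same derivative model can be stepped with a different dt at inference.
traj_coarse = rollout(predict_derivative, u0, dt=0.25, n_steps=4, step=rk4_step)

err_euler = np.abs(traj_euler[-1] - exact).max()
err_rk4 = np.abs(traj_rk4[-1] - exact).max()
err_coarse = np.abs(traj_coarse[-1] - exact).max()
```

Because the model only supplies the derivative, the step size and the integrator are inference-time choices. In this toy setting the RK4 rollout is far closer to the exact solution than Forward Euler at the same step size, at the cost of four model evaluations per step instead of one.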

Paper Structure

This paper contains 19 sections, 9 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: A comparison of state prediction and derivative prediction, where models are either trained to predict $\mathbf{u}_{n+1}$ or $\frac{\partial\mathbf{u}}{\partial t}|_{t=t_n}$. During inference, models are given an initial solution $\mathbf{u}_n$, and predict future solutions along the dashed trajectory. By predicting the temporal derivatives rather than the future solution, derivative prediction can learn spatial updates while an ODE integrator updates the solution in time, which can improve accuracy. Furthermore, derivative prediction can use higher-order integrators or variable timesteps, which further improves its accuracy and flexibility, while being applicable across model architectures and datasets.
  • Figure 2: 1D KS Equation. Comparison of state prediction and derivative prediction using an RK4 integrator on the 1D KS equation. Time is plotted on the x-axis and nodal values on the y-axis. Correlation time is denoted with a dashed white line.
  • Figure 3: 2D Kolmogorov Flow. Comparison of state prediction and derivative prediction using an RK4 integrator on 2D Kolmogorov Flow. The spatial dimensions are plotted for each frame, at multiple snapshots in time from top to bottom. The rollout is visualized at times $0, \frac{T}{3}, \frac{2T}{3}, T$, where $T$ is the prediction horizon.
  • Figure 4: Prediction Timescales. Rollout error of FNO or Unet models trained with state prediction or derivative prediction with a Forward Euler or RK4 integrator; errors are plotted after predicting solutions at different timescales $\Delta t$. The resolution at which $CFL=1$ is denoted for 1D Advection.
  • Figure 5: Loss Landscapes. Visualization of the loss landscape of state prediction and derivative prediction after training a Unet on 1D Advection. Trained model parameters $\theta^*$ are perturbed by a linear combination of random direction vectors $\delta,\gamma$ at sampled $a, b$ values. The validation rollout loss for each perturbed set of weights $\theta^* + a\delta + b\gamma$ is calculated, averaged across the validation set, and plotted as contours or surfaces. The scale and threshold of $L$ are kept consistent across contour and surface plots for a given model/framework.
  • ...and 2 more figures
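
The weight-perturbation procedure in the Figure 5 caption can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' code: `loss_surface` is a hypothetical helper, the directions $\delta,\gamma$ are drawn as plain Gaussian vectors (the caption does not say how they are sampled or scaled), and a toy quadratic loss stands in for the validation rollout loss averaged over the validation set.

```python
import numpy as np

def loss_surface(loss_fn, theta_star, a_values, b_values, seed=0):
    # Evaluate loss_fn at theta* + a*delta + b*gamma over a grid of (a, b).
    # Assumption: delta and gamma are unnormalized Gaussian random directions.
    rng = np.random.default_rng(seed)
    delta = rng.standard_normal(theta_star.shape)
    gamma = rng.standard_normal(theta_star.shape)
    surface = np.empty((len(a_values), len(b_values)))
    for i, a in enumerate(a_values):
        for j, b in enumerate(b_values):
            surface[i, j] = loss_fn(theta_star + a * delta + b * gamma)
    return surface

def toy_loss(theta):
    # Toy stand-in for the averaged validation rollout loss (assumption):
    # a quadratic bowl with its minimum at theta = 0.
    return float(np.sum(theta ** 2))

theta_star = np.zeros(5)              # pretend these are trained weights
grid = np.linspace(-1.0, 1.0, 5)      # sampled a and b values
surface = loss_surface(toy_loss, theta_star, grid, grid)
```

The resulting 2D array can be passed to a contour or surface plotting routine. With a real model, `theta_star` would be the flattened trained parameters and `loss_fn` would run a validation rollout for each perturbed set of weights.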