Neural Operator Learning for Long-Time Integration in Dynamical Systems with Recurrent Neural Networks

Katarzyna Michałowska, Somdatta Goswami, George Em Karniadakis, Signe Riemer-Sørensen

TL;DR

This work tackles the difficulty of long-time horizon prediction with neural-operator surrogates by coupling neural operators (DeepONet or FNO) with recurrent networks (RNN, GRU, LSTM). The proposed operator–RNN hybrids, trained in simultaneous or two-step fashions, stabilize trajectories and reduce error accumulation for both interpolation and extrapolation on the KdV equation. Key findings show that GRU/LSTM variants typically yield the best accuracy, and simultaneous training often offers the strongest gains, especially in preserving wave shapes during extrapolation. The results highlight the potential of discretization-invariant neural operators integrated with temporal models as fast, stable emulators for complex dynamical systems, while also underscoring the need for theoretical error analysis and broader applicability.

Abstract

Deep neural networks are an attractive alternative for simulating complex dynamical systems: compared to traditional scientific computing methods, they offer reduced computational costs during inference and can be trained directly from observational data. Existing methods, however, cannot extrapolate accurately and are prone to error accumulation in long-time integration. Herein, we address this issue by combining neural operators with recurrent neural networks, so that the model learns the operator mapping while the recurrent structure captures temporal dependencies. The integrated framework is shown to stabilize the solution and reduce error accumulation for both interpolation and extrapolation of the Korteweg-de Vries equation.

Paper Structure (15 sections, 5 equations, 5 figures, 4 tables)


Figures (5)

  • Figure 1: The architecture combining a deep neural operator (e.g., DeepONet or FNO) and a recurrent neural network architecture (Figure adapted from michalowska2023don). The neural operator learns the mapping from the initial condition $u_{t=0}$ to solutions at later timesteps. These solutions are then represented as a sequence and fed into an RNN, which outputs the solution.
  • Figure 2: E$1$: Interpolation performance. The models are both trained and tested on full trajectories ($200$ steps in $t\in[0,5]$). The error is given as the relative squared error over all spatial points $x$ for each time step $t$ on the test data. The enhanced models are compared to vanilla neural operators. The shaded area indicates the 95% confidence interval calculated over the three trained models (two standard deviations of the error).
  • Figure 3: E$2$: Extrapolation performance. The models are trained on partial trajectories ($50$ steps in $t\in[0,1.25]$) and tested on full trajectories ($200$ steps in $t\in [0,5]$). The error is given as the relative squared error over all spatial points $x$ for each time step $t$ on the test data. The panels show different combinations of vanilla operators and enhanced models. The shaded area indicates the 95% confidence interval calculated over the three trained models (two standard deviations of the error).
  • Figure 4: E$1$: Interpolation performance on one representative sample for all models trained on $t\in[0,5]$. The full trajectory $t\in[0.025,5]$ is predicted in one shot. Each subplot shows predictions (red dashed line) and the ground truth (blue solid line), row-wise at $t=\{1.25, 2.5, 3.75, 5\}$ and column-wise for all models: vanilla DeepONet; DeepONet+RNN, DeepONet+GRU, and DeepONet+LSTM trained in two steps; vanilla FNO; FNO+RNN, FNO+GRU, and FNO+LSTM trained in two steps; and the simultaneously trained DON-RNN and FNO-RNN.
  • Figure 5: E$2$: Extrapolation performance on one representative sample for all models trained on $t\in[0,1.25]$. The full trajectory $t\in[0.025,5]$ is covered recursively by short-interval one-shot predictions, i.e., $t=0 \rightarrow t\in[0.025,1.25]$, $t=1.25 \rightarrow t\in[1.275,2.5]$, and so on up to $t=5$: the last predicted solution of each interval is used as the initial condition for the next one-shot prediction over an interval of length $\Delta t=1.25$. Each subplot shows predictions (red dashed line) and the ground truth (blue solid line) for the last solution of each interval (the next initial condition), row-wise at $t=\{1.25, 2.5, 3.75, 5\}$ and column-wise for all models: vanilla DeepONet; DeepONet+RNN, DeepONet+GRU, and DeepONet+LSTM trained in two steps; vanilla FNO; FNO+RNN, FNO+GRU, and FNO+LSTM trained in two steps; and the simultaneously trained DON-RNN and FNO-RNN.
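
The recursive extrapolation scheme of Figure 5 and the per-time-step error of Figures 2 and 3 can be illustrated with a minimal sketch. It assumes the relative squared error is the sum of squared differences over $x$ divided by the sum of squared ground-truth values (the captions do not spell out the formula). The trained operator-RNN model is replaced by a hypothetical toy predictor that simply damps the state at every step, so the numbers say nothing about the KdV equation. The names `relative_squared_error`, `recursive_rollout`, and `make_decay_model` are illustrative and do not come from the paper.

```python
from typing import Callable, List

State = List[float]  # solution u(x) sampled on a fixed spatial grid


def relative_squared_error(pred: State, true: State) -> float:
    """Assumed metric: sum_x (pred - true)^2 divided by sum_x true^2."""
    num = sum((p - t) ** 2 for p, t in zip(pred, true))
    den = sum(t ** 2 for t in true)
    return num / den


def recursive_rollout(
    one_shot: Callable[[State], List[State]], u0: State, n_windows: int
) -> List[State]:
    """Chain one-shot predictions: the last state of each window
    becomes the initial condition of the next window."""
    trajectory: List[State] = []
    u = u0
    for _ in range(n_windows):
        window = one_shot(u)      # all time steps of one short interval
        trajectory.extend(window)
        u = window[-1]            # next initial condition
    return trajectory


def make_decay_model(rate: float, steps: int) -> Callable[[State], List[State]]:
    """Hypothetical stand-in for a trained model: step k returns u * rate**k."""
    def one_shot(u: State) -> List[State]:
        return [[v * rate ** k for v in u] for k in range(1, steps + 1)]
    return one_shot


# 4 windows of 50 steps each, mirroring 200 steps on t in [0, 5].
u0 = [0.0, 1.0, 2.0, 1.0]
steps_per_window, n_windows = 50, 4

# Reference trajectory (toy "ground truth") with a slightly different rate.
truth = make_decay_model(0.99, steps_per_window * n_windows)(u0)

# Imperfect toy model, trained-window length only, rolled out recursively.
model = make_decay_model(0.98, steps_per_window)
trajectory = recursive_rollout(model, u0, n_windows)

# One error value per time step, computed over all spatial points.
errors = [relative_squared_error(p, t) for p, t in zip(trajectory, truth)]
```

In this toy setting the error grows monotonically along the rollout. With a real model, each window would additionally inherit the error of the initial condition predicted by the previous window, which is the accumulation that the extrapolation experiment probes.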