Deep Learning of the Evolution Operator Enables Forecasting of Out-of-Training Dynamics in Chaotic Systems

Ira J. S. Shokar, Peter H. Haynes, Rich R. Kerswell

Abstract

We demonstrate that a deep learning emulator for chaotic systems can forecast phenomena absent from training data. Using the Kuramoto-Sivashinsky and beta-plane turbulence models, we evaluate the emulator through scenarios probing the fundamental phenomena of both systems: forecasting spontaneous relaminarisation, capturing initialisation of arbitrary chaotic states, zero-shot prediction of dynamics with parameter values outside of the training range, and characterisation of dynamical statistics from artificially restricted training datasets. Our results show that deep learning emulators can uncover emergent behaviours and rare events in complex systems by learning underlying mathematical rules, rather than merely mimicking observed patterns.

Paper Structure

This paper contains 4 sections, 3 equations, 7 figures.

Figures (7)

  • Figure 1: Prediction of out-of-training-distribution dynamics of the KS equation with $L=56$, where the training data exclude both relaminarisation events and warm-up dynamics. (a) Relaminarisation event observed in direct numerical simulation (DNS). (b) Neural network (NN) emulation predicts the relaminarisation event from identical initial conditions. (c) Mean absolute error (MAE) and root mean squared error (RMSE) between the DNS in (a) and the NN prediction in (b). (d) Initialisation dynamics of KS flow from DNS. (e) NN emulation forecasts the correct dynamics from the same initial conditions. (f) MAE and RMSE between the initialisation dynamics in (d) and the NN prediction in (e). The largest Lyapunov exponent for $L=56$ is $\Lambda_{\text{max}} \approx 0.048$.
  • Figure 2: Transition probabilities for kink count evolution in the KS equation, where the number of kinks (localised steep gradients) at time $t$ determines whether the count increases (red), decreases (blue), or remains constant (green) at time $t+\Delta t$. (a) Comparison of kink count distributions for a restricted training dataset limited to four or fewer kinks, showing the distributions from full KS dynamics (solid lines) and the NN emulator trained on the restricted dataset (dashed lines). (b) Same as (a), but with the restricted dataset excluding all transition events involving kink count changes.
  • Figure 3: Prediction of the Kuramoto-Sivashinsky (KS) equation dynamics with $L=400$, extrapolated beyond the training set. (a) Numerical integration of the KS equation with $L=400$. (b) NN emulation of the KS dynamics, with the emulator pretrained on data for $L=22$ and fine-tuned with datasets for $L \in \{48, 64, 96, 128, 164, 200\}$. (c) MAE and RMSE comparing the numerical simulation in (a) to the NN prediction in (b).
  • Figure 4: Prediction of out-of-training-distribution dynamics for beta-plane turbulence, where the training dataset includes only states with three jets. (a) Zonally averaged zonal velocity $U_1$ from the upper layer, obtained via DNS. (b) $\tilde{U_1}$ predicted by the NN from the same initial condition as (a), with the network trained on a dataset containing only three-jet states and no transition events. (c) Transition probabilities for the number of jets, as in Fig. 2, comparing DNS (solid lines) for the full system and the NN trained on the restricted dataset (dashed lines).
  • Figure 5: Schematic of the NN architecture. The network is structured around a transformer architecture conditioned on the parameter $L$ and initialised with conditions $U$. Within each transformer block, adaptive layer normalisation conditions the transformer on $L$ by replacing the learned scale and shift parameters with functions of $L$. Each $W$ denotes learned weights for linear transformations, with arrows indicating the forward pass. In this study, the conditioning parameter $L$ has size $M=1$ and the temporal history provided to the model has length $S=1$. The architecture can, however, be extended to higher-dimensional parameter spaces.
  • ...and 2 more figures
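
The transition probabilities plotted in Figures 2 and 4(c) condition on the kink (or jet) count at time $t$ and ask whether the count has increased, decreased, or stayed constant at time $t+\Delta t$. A minimal sketch of one way to estimate them from a time series of counts is given below. The function name `transition_probabilities`, the `lag` parameter standing in for $\Delta t$, and the example sequence are all hypothetical; how kinks or jets are detected, and the value of $\Delta t$ used in the paper, are not stated here, so the counts are taken as given.

```python
from collections import Counter, defaultdict


def transition_probabilities(counts, lag=1):
    """Estimate P(increase), P(decrease), P(constant) given the count at time t.

    counts: sequence of integer kink (or jet) counts at uniform time steps.
    lag: number of samples between t and t + dt (hypothetical parameter).
    """
    # tallies[n] records what happened `lag` samples after observing count n.
    tallies = defaultdict(Counter)
    for now, later in zip(counts, counts[lag:]):
        if later > now:
            tallies[now]["increase"] += 1
        elif later < now:
            tallies[now]["decrease"] += 1
        else:
            tallies[now]["constant"] += 1

    # Normalise each tally so the three outcomes sum to one for every count.
    probabilities = {}
    for n, tally in sorted(tallies.items()):
        total = sum(tally.values())
        probabilities[n] = {
            key: tally[key] / total
            for key in ("increase", "decrease", "constant")
        }
    return probabilities


# Made-up count sequence, for illustration only (not data from the paper).
example_counts = [3, 3, 4, 4, 3, 3, 3, 2, 3]
example_probs = transition_probabilities(example_counts)
```

Applying the same function to counts extracted from the DNS and from the emulator output would give the two sets of curves being compared (solid and dashed lines in the figures), under the assumptions stated above.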