Can Transformers overcome the lack of data in the simulation of history-dependent flows?

P. Urdeitx, I. Alfaro, D. Gonzalez, F. Chinesta, E. Cueto

TL;DR

This work investigates whether Transformer architectures can compensate for missing history-dependent variables in complex fluid dynamics by operating in a thermodynamically structured latent space. A three-stage framework first embeds observable states into a GENERIC-based latent manifold trained with a metriplectic integrator, then compares a structure-preserving neural network (SPNN) against a Transformer that evolves sequences of latent states. Across three benchmarks (flow past a cylinder, Oldroyd-B Couette flow, and a FENE polymeric fluid), the Transformer generally outperforms the SPNN trained on incomplete states when variables like the conformation tensor are unavailable, particularly in memory-driven, nonlinear regimes, while fully observed, history-free cases favor the SPNN baseline. The study demonstrates that attention mechanisms can implicitly recover history dependence, reducing reliance on hard-to-measure internal variables and offering a path toward robust, data-efficient modeling of non-Markovian dynamics.

Abstract

It is well known that the lack of information about certain variables necessary for the description of a dynamical system leads to the introduction of history dependence (lack of Markovian character of the model) and noise. Traditionally, scientists have made up for these shortcomings by designing phenomenological variables that take into account this history dependence (typically, conformation tensors in fluids). Often, these phenomenological variables are not easily measurable experimentally. In this work, we study to what extent Transformer architectures are able to cope with the lack of experimental data on these variables. The methodology is evaluated on three benchmark problems: a cylinder flow with no history dependence, a viscoelastic Couette flow modeled via the Oldroyd-B formalism, and a nonlinear polymeric fluid described by the FENE model. Our results show that the Transformer outperforms a thermodynamically consistent, structure-preserving neural network with metriplectic bias in systems with missing experimental data, providing lower errors even in low-dimensional latent spaces. In contrast, for systems whose state variables can be fully known, the metriplectic model achieves superior performance.

Paper Structure

This paper contains 11 sections, 26 equations, 9 figures.

Figures (9)

  • Figure 1: Integration scheme for (a) the Metriplectic Neural Network, and (b) the Transformer-based model. In both cases, the physical state is first mapped to a latent representation via an encoder. In (a), the evolution is computed through a learned metriplectic formalism based on the GENERIC structure, and then decoded back to physical space. In (b), the model receives as input a context window of physical states $\boldsymbol{Z} = \{ \boldsymbol{z}_{n-s}, \dots, \boldsymbol{z}_{n} \}$, which is encoded and passed to the Transformer integrator to predict the next latent state $\boldsymbol{\xi}_{n+1}$. This is decoded to recover the next physical state $\boldsymbol{z}_{n+1}$.
  • Figure 2: Training scheme for (a) the Metriplectic neural network, and (b) the Transformer-based architecture. In both cases, the physical state $\boldsymbol{z} \in \mathbb{R}^{D}$ is first encoded into a low-dimensional latent representation $\boldsymbol{\xi} \in \mathbb{R}^{d}$. The integration stage then governs the system evolution based on (a) the metriplectic formalism or (b) a context-based Transformer. In the latter case, since the integrator operates on a sequence of latent vectors, the input is derived from a window of physical states $\boldsymbol{Z} = \{ \boldsymbol{z}_{n-s}, \dots, \boldsymbol{z}_{n} \}$.
  • Figure 3: Last snapshot from the rollout results of the flow around a cylinder, in a validation case. (a) The metriplectic neural network shows good agreement with the ground truth. (b) After 398 snapshots, the Transformer exhibits a delay with respect to the ground truth.
  • Figure 4: RRMSE for rollout reconstruction on the flow around a cylinder validation dataset: (a) Metriplectic, (b) Transformer.
  • Figure 5: Last snapshot from the Oldroyd-B model rollout of a validation case. The system variables are represented along the 200 points of the discretized section (Y axis). The predictions are represented with a blue solid line, while the ground truth is a black dashed line. The results show a good level of agreement between the ground truth and the predictions obtained by (a) the Metriplectic* neural network and (b) the Transformer model.
  • ...and 4 more figures
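
The two integration schemes described in the captions of Figures 1 and 2 differ mainly in what the integrator is allowed to see: the metriplectic model advances a single latent state $\boldsymbol{\xi}_n$, while the Transformer-based model advances a sliding window of states. The following is a minimal NumPy sketch of that data flow only, not the authors' implementation. Everything in it is an assumption made for illustration: the toy three-variable system (a damped oscillator with an entropy variable), the hand-written operators `L_op` and `M_op`, the forward-Euler update, the damping value `GAMMA`, and the `extrapolate` function, which is a trivial stand-in for a trained Transformer. Encoders and decoders are omitted, so the "latent" state here is simply the toy state itself.

```python
import numpy as np

GAMMA = 0.5  # hypothetical damping coefficient of the toy system


def grad_E(xi):
    """Gradient of the toy energy E = q^2/2 + p^2/2 + s."""
    q, p, _ = xi
    return np.array([q, p, 1.0])


def grad_S(xi):
    """Gradient of the toy entropy S = s."""
    return np.array([0.0, 0.0, 1.0])


def L_op(xi):
    """Skew-symmetric (conservative) operator; satisfies L @ grad_S = 0."""
    return np.array([[0.0, 1.0, 0.0],
                     [-1.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0]])


def M_op(xi):
    """Symmetric positive semi-definite (dissipative) operator.

    Built as GAMMA * v v^T with v orthogonal to grad_E, so M @ grad_E = 0.
    """
    v = np.array([0.0, 1.0, -xi[1]])
    return GAMMA * np.outer(v, v)


def metriplectic_step(xi, dt):
    """One forward-Euler step of d(xi)/dt = L grad(E) + M grad(S).

    Forward Euler is an assumption; the paper's integrator may differ.
    """
    return xi + dt * (L_op(xi) @ grad_E(xi) + M_op(xi) @ grad_S(xi))


def rollout_markov(xi0, dt, n_steps):
    """Markovian rollout: each step depends only on the current state."""
    traj = [np.asarray(xi0, dtype=float)]
    for _ in range(n_steps):
        traj.append(metriplectic_step(traj[-1], dt))
    return np.stack(traj)


def rollout_with_context(window, predict_next, n_steps):
    """Autoregressive rollout over a sliding context window.

    window: array of shape (s + 1, d) holding the states n-s, ..., n.
    predict_next: callable mapping a window to the next state of shape (d,).
    """
    window = np.array(window, dtype=float)
    predictions = []
    for _ in range(n_steps):
        nxt = predict_next(window)
        predictions.append(nxt)
        # Drop the oldest state and append the newest prediction.
        window = np.vstack([window[1:], nxt])
    return np.stack(predictions)


def extrapolate(window):
    """Stand-in for a trained sequence model: linear extrapolation."""
    return 2.0 * window[-1] - window[-2]
```

Calling `rollout_markov(np.array([1.0, 0.0, 0.0]), 1e-3, 100)` returns a trajectory of shape `(101, 3)` in which the entropy variable never decreases, which is the property the metriplectic bias is meant to enforce. Calling `rollout_with_context(window, extrapolate, n_steps)` shows the windowed alternative: the predictor has access to past states, which is what lets a sequence model compensate for variables that are missing from the current state.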