Latent attention on masked patches for flow reconstruction

Ben Eze, Luca Magri, Andrea Nóvoa

TL;DR

This paper introduces the Latent Attention on Masked Patches (LAMP) model, an interpretable, regression-based modified vision transformer that provides an efficient baseline for nonlinear and high-dimensional masked flow reconstruction.

Abstract

Vision transformers have demonstrated outstanding performance in image generation, but their adoption in scientific disciplines such as fluid dynamics has been limited. We introduce the Latent Attention on Masked Patches (LAMP) model, an interpretable regression-based modified vision transformer designed for masked flow reconstruction. LAMP follows a three-step strategy: (i) partition of each flow snapshot into patches, (ii) dimensionality reduction of each patch via patch-wise proper orthogonal decomposition, and (iii) reconstruction of the full field from a masked input using a single-layer transformer trained via closed-form linear regression. We test the method on two canonical 2D unsteady wakes: a wake past a bluff body, and a chaotic wake past a flat plate. We show that LAMP accurately reconstructs the full flow field from a 90%-masked and noisy input, across signal-to-noise ratios between 10 and 30 dB. Incorporating nonlinear measurement states can reduce the prediction error by up to an order of magnitude. The learned attention matrix yields physically interpretable multi-fidelity optimal sensor-placement maps. The modularity of the framework allows extension to nonlinear compression and deep attention blocks, thereby providing an efficient baseline for nonlinear and high-dimensional masked flow reconstruction.
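The three steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the synthetic travelling-wave field, the fixed choice of visible patches, the ridge parameter `lam`, and all function names are assumptions made for illustration, and a plain ridge regression from visible-patch latents to all latents stands in for the pair-wise, confidence-weighted attention described in the paper.

```python
import numpy as np

def patchify(snapshots, P):
    """(i) Split (T, H, W) snapshots into non-overlapping P x P patches.
    Returns an array of shape (T, n_patches, P*P)."""
    T, H, W = snapshots.shape
    assert H % P == 0 and W % P == 0, "grid must be divisible by P"
    x = snapshots.reshape(T, H // P, P, W // P, P).transpose(0, 1, 3, 2, 4)
    return x.reshape(T, -1, P * P)

def fit_patch_pod(patches, n_e):
    """(ii) Patch-wise POD: one orthonormal basis of n_e modes per patch."""
    means = patches.mean(axis=0)                      # (n_patches, D)
    bases = []
    for j in range(patches.shape[1]):
        centred = patches[:, j, :] - means[j]
        # Rows of Vt are the POD modes of patch j.
        _, _, Vt = np.linalg.svd(centred, full_matrices=False)
        bases.append(Vt[:n_e].T)                      # (D, n_e)
    return means, np.stack(bases)                     # (n_patches, D, n_e)

def encode(patches, means, bases):
    """Project each patch onto its own POD basis: (T, n_patches, n_e)."""
    return np.einsum("tjd,jde->tje", patches - means, bases)

def decode(latents, means, bases):
    """Map latent coefficients back to patch space: (T, n_patches, D)."""
    return np.einsum("tje,jde->tjd", latents, bases) + means

def fit_ridge(Z_in, Z_out, lam=1e-8):
    """(iii) Closed-form ridge regression: W = (Z_in^T Z_in + lam I)^-1 Z_in^T Z_out."""
    G = Z_in.T @ Z_in + lam * np.eye(Z_in.shape[1])
    return np.linalg.solve(G, Z_in.T @ Z_out)

# Synthetic stand-in for a wake: two travelling waves (assumed toy data).
T, H, W, P, n_e = 300, 32, 32, 8, 4
x = np.linspace(0.0, 2.0 * np.pi, W, endpoint=False)[None, None, :]
y = np.linspace(0.0, 2.0 * np.pi, H, endpoint=False)[None, :, None]
t = (0.1 * np.arange(T))[:, None, None]
field = np.sin(x - t) * np.cos(y) + 0.5 * np.sin(2.0 * x - 1.7 * t) * np.cos(y)

patches = patchify(field, P)                          # (300, 16, 64)
train, test = patches[:240], patches[240:]

means, bases = fit_patch_pod(train, n_e)
Z_train = encode(train, means, bases)                 # (240, 16, 4)
Z_test = encode(test, means, bases)

# Keep 2 of 16 patches visible (87.5% masked); the indices are arbitrary.
visible = [0, 10]
W_reg = fit_ridge(Z_train[:, visible].reshape(240, -1),
                  Z_train.reshape(240, -1))

# Reconstruct every patch of the held-out snapshots from the visible ones.
Z_pred = (Z_test[:, visible].reshape(len(test), -1) @ W_reg).reshape(Z_test.shape)
recon = decode(Z_pred, means, bases)
rel_err = np.linalg.norm(recon - test) / np.linalg.norm(test)
print(f"relative reconstruction error: {rel_err:.2e}")
```

Because the toy field is low-rank in time, a handful of visible patches determines the whole field almost exactly; real wake data would need larger latent dimensions and would show a nonzero reconstruction error.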

Paper Structure (7 sections, 5 figures)

This paper contains 7 sections, 5 figures.

Figures (5)

  • Figure 1: Pictorial illustration of the patch-wise latent attention model (LAMP). A linear autoencoder embeds the input in a lower-dimensional latent space, where predictions can be made efficiently. The predictions are pair-wise (i.e., each patch predicts every other patch) and weighted with a confidence score.
  • Figure 2: Patch-wise POD reconstruction performance for varying patch size $P$. Loss $\mathcal{L}^\mathrm{AE}$ versus (a) latent dimension $N_e$ and (b) compression factor $D/N_e$.
  • Figure 3: Reconstruction error $\mathcal{L}^\mathrm{pred}$ for varying noise levels, patch size $P$ and latent dimension $N_e$. The input is noisy masked data $\mathbf{X}^\mathrm{masked} + \varepsilon$, with SNR = (a) $\infty$ (noise-free), (b) 30 dB, (c) 20 dB, (d) 10 dB. The loss is computed against the noise-free target $\mathbf{X}^\mathrm{test}$. Horizontal dashed lines show the noise variance $\sigma^2_\mathrm{noise}$. $\mathcal{L}^\mathrm{pred}$ is calculated for 25 random patch arrangements and the median is plotted. An example reconstruction for the noise-free case with $P=16$, $N_e=8$ is shown.
  • Figure 4: Predictive-power maps for varying patch size and latent dimension. Patches with higher predictive power are better at predicting the full flow field.
  • Figure 5: Masked reconstruction on the chaotic wake. Prediction error $\mathcal{L}^\mathrm{pred}$ for varying patch size and latent dimension, with input channels $(u,v)$ shown as dotted lines and $(u, v, uv)$ as solid lines. An example of the masked reconstruction from 2/8 unmasked patches with $P=48$, $N_{e}=150$, and three channels is shown.
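
The captions of Figures 3 and 5 mention two input treatments: additive noise at a prescribed SNR, and a nonlinear third channel $uv$. A minimal sketch of both follows. It assumes the usual power-ratio definition, SNR (dB) $= 10\log_{10}(\text{signal power}/\sigma^2_\mathrm{noise})$, with signal power taken as the mean square of the input and zero-mean Gaussian noise; the paper's exact definitions are not stated here, and the function names are hypothetical.

```python
import numpy as np

def add_noise_at_snr(x, snr_db, rng):
    """Add zero-mean Gaussian noise so that the expected SNR equals snr_db.
    Assumption: signal power is the mean square of x."""
    signal_power = np.mean(x ** 2)
    noise_var = signal_power / 10.0 ** (snr_db / 10.0)
    noise = rng.normal(0.0, np.sqrt(noise_var), size=x.shape)
    return x + noise, noise_var

def stack_nonlinear_channels(u, v):
    """Stack the velocity components with their product: channels (u, v, uv)."""
    return np.stack([u, v, u * v], axis=-1)

rng = np.random.default_rng(0)

# Toy signal: a sine sampled at many points (illustrative only).
signal = np.sin(np.linspace(0.0, 200.0 * np.pi, 100_000))
noisy, noise_var = add_noise_at_snr(signal, 20.0, rng)

# Empirical SNR of this realisation, for comparison with the 20 dB target.
measured_snr_db = 10.0 * np.log10(np.mean(signal ** 2) / np.mean((noisy - signal) ** 2))
print(f"target 20.0 dB, measured {measured_snr_db:.2f} dB, noise variance {noise_var:.4f}")

# Three-channel input built from a small random (u, v) pair.
u = rng.standard_normal((4, 4))
v = rng.standard_normal((4, 4))
uvw = stack_nonlinear_channels(u, v)
```

Under this definition, the noise variance returned by `add_noise_at_snr` is the quantity that the horizontal dashed lines in Figure 3 would correspond to.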