Stochastic Reconstruction of Gappy Lagrangian Turbulent Signals by Conditional Diffusion Models

Tianyi Li, Luca Biferale, Fabio Bonaccorso, Michele Buzzicotti, Luca Centurioni

Abstract

We present a stochastic method for reconstructing missing spatial and velocity data along the trajectories of small objects passively advected by turbulent flows with a wide range of temporal or spatial scales, such as small balloons in the atmosphere or drifters in the ocean. Our approach uses conditional generative diffusion models, a recently proposed data-driven machine learning technique. We address two paradigmatic open problems: 3D tracers in homogeneous and isotropic turbulence, and 2D trajectories from the NOAA-funded Global Drifter Program. We show that in both cases our method reconstructs velocity signals that retain non-trivial scale-by-scale properties that are highly non-Gaussian and intermittent. A key feature of our method is its flexibility in dealing with the location and shape of data gaps, as well as its ability to naturally exploit correlations between different components. This leads to superior accuracy, with respect to Gaussian process regression, for both pointwise reconstruction and statistical expressivity. Our method also shows promise for a wide range of other Lagrangian problems, including multi-particle dispersion in turbulence, dynamics of charged particles in astrophysics and plasma physics, and pedestrian dynamics.

Paper Structure

This paper contains 2 sections, 16 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: (a1) Setup of the Lagrangian turbulent signal reconstruction. In this example, the goal is to reconstruct the missing observation of one generic velocity component, $V_i(t)$, of a 3D turbulent tracer. We assume that there is missing data within a large time window in the middle of the trajectory (region denoted as $G$), while the beginning and end chunks are assumed to be measured and known (regions denoted as $M$). Once trained, our conditional diffusion model (C-DM) reconstructs the signal within the gap through a backward multi-step denoising process, starting from a pure uncorrelated Gaussian guess in the region $G$ at the beginning of the process $n=N$ (top row), gradually generating a denoised signal conditioned on the data measured in the regions $M$ (middle row), and ending with the final realistic guess at the last iteration, $n=0$ (bottom row). Panel (a2) shows a 3D representation of the gappy trajectory for visualisation purposes. (b) Standardized probability density functions (PDFs) of a generic component of the velocity increment, $\delta_\tau V_i$, defined in Eq. (\ref{eq:deltav}), for different time lags $\tau/\tau_\eta=0.5,2,100$ (from bottom to top) for both ground-truth DNS data (black lines) and reconstructed data from the C-DM (green solid circles) for a central gap of size $50\tau_\eta$. PDFs for different $\tau$ are shifted vertically for clarity. The PDF is Gaussian for large time lags and develops progressively fatter tails as $\tau$ decreases, illustrating the non-trivial intermittent statistical properties of the Lagrangian turbulence dataset. (c) Ocean surface drifter trajectories \cite{elipot2022hourly}, with three specific regions where trajectories are colored by their total kinetic energy $\bar{E}_v$: (A) two Western Boundary Currents (WBCs), the Kuroshio Current and Gulf Stream (blue contours); (B) the Tropics (TRO) (green contour); and (C) the Antarctic Circumpolar Current (ACC) (orange contour). Trajectories outside these regions are shown in gray. (d) Standardized PDFs of the eastward velocity for the three regions from panel (c), based on observations and C-DM reconstructions, with a central gap of size $360\tau_0$. Observational data (Obs) are shown as blue solid, green dashed, and orange dash-dotted lines, while C-DM reconstructions are shown as blue circles, green triangles, and orange squares for regions A, B and C, respectively. Here, $\sigma$ represents the standard deviation computed from the ground-truth dataset.
  • Figure 2: Geometries of (a1) a central gap (CG) and (a2) a right-end gap (RG), with gap regions indicated in gray. (b) Plot of the overall mean squared error (MSE), $\langle\bar{\Delta}\rangle$, for gaps of different sizes $T_g$ for 3D Lagrangian turbulence reconstruction. Results are shown for one generic component (1c) using C-DM (green bars) and Gaussian process regression (GPR, purple bars). Right-end gaps are shown with diagonal hatching, while central gaps are shown without hatching. In addition, for a central gap of size $50\tau_\eta$, the results for three components (3c) are also shown, with cross-hatching for C-DM (green) and GPR (purple). (c) Similar to panel b, but for ocean drifter observations with central gaps. The 1c case is shown without hatching, while the 2-component (2c) case is shown with cross-hatching. (d) PDFs of the MSE for a single configuration, $\bar{\Delta}$, obtained from C-DM and GPR for 1c and 3c cases, for a central gap of size $50\tau_\eta$ in Lagrangian turbulence reconstruction. (e) The MSE, $\langle \Delta (t) \rangle$, as a function of time within the gap, for Lagrangian turbulence reconstruction using C-DM and GPR for 1c and 3c cases, with a central gap of size $50\tau_\eta$. Here $t_g$ represents the relative time position from the left gap edge, as shown in panel a. (f) Similar to panel e, but for ocean drifter observations with a central gap of size $360\tau_0$. Error bars represent the minimum and maximum values obtained for different velocity components.
  • Figure 3: (a,b) Scatter plots of the maximum acceleration magnitude in a central gap of size $50\tau_\eta$, comparing the ground truth with reconstructions from (a) C-DM and (b) GPR. Colors represent the density of points in the scatter plot. Results are based on 32,768 test configurations, with one realization of the stochastic reconstruction for each configuration. Three specific configurations (C1, C2, C3), highlighted by red circles, are shown in (c). (d-f) PDFs of the maximum acceleration magnitude in the gap for the three fixed configurations: (d) C1, (e) C2, and (f) C3, from C-DM and GPR, with the ground-truth DNS value marked by a vertical black line.
  • Figure 4: (a) The fourth-order flatness, $F_\tau^{(4)}$, for 3D tracers from the ground truth DNS, C-DM and GPR reconstructions with a central gap of size $50\tau_\eta$. (b) $F_\tau^{(4)}$ for ocean drifter observations (Obs) with a central gap of size $360\tau_0$. (c) Same as panel b, but comparing Obs and C-DM reconstructions for fully drogued (top two) and undrogued (bottom two) drifters. (d-f) Regional $F_\tau^{(4)}$ conditioned on trajectories from the WBC (d), TRO (e) and ACC (f) regions, corresponding to regions A, B and C in Fig. \ref{fig:setup}c, respectively. Error bars are estimated from the spread between different velocity components.
  • Figure 5: (a) Standardized PDFs of a generic component of acceleration, $a_i$, for ground-truth DNS data (black line), C-DM reconstructed data (green solid circles) and GPR reconstructed data (purple hollow triangles) within a central gap of size $50\tau_\eta$ for Lagrangian turbulence reconstruction. (b) Similar to panel a, but for ocean drifter observations with a central gap of size $360\tau_0$.
  • ...and 3 more figures
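
The captions of Figures 1 and 4 rely on two scale-by-scale diagnostics: the velocity increment $\delta_\tau V_i$ and the fourth-order flatness $F_\tau^{(4)}$. The paper's own definitions are not reproduced on this page, so the sketch below assumes the standard forms, $\delta_\tau V_i = V_i(t+\tau) - V_i(t)$ and $F_\tau^{(4)} = \langle(\delta_\tau V_i)^4\rangle / \langle(\delta_\tau V_i)^2\rangle^2$. The function names, the array layout (trajectories along the first axis, time along the last) and the synthetic Gaussian test signal are illustrative assumptions, not the authors' code or data.

```python
import numpy as np


def velocity_increments(v, lag):
    """Assumed definition: delta_tau V = V(t + tau) - V(t), with lag in samples."""
    v = np.asarray(v, dtype=float)
    if lag < 1 or lag >= v.shape[-1]:
        raise ValueError("lag must satisfy 1 <= lag < number of time samples")
    return v[..., lag:] - v[..., :-lag]


def flatness(v, lag):
    """Assumed definition: <dv^4> / <dv^2>^2, averaged over time and trajectories."""
    dv = velocity_increments(v, lag)
    return np.mean(dv**4) / np.mean(dv**2) ** 2


def standardized_pdf(x, sigma=None, bins=50):
    """Histogram estimate of the PDF of (x - mean) / sigma.

    If sigma is None, the standard deviation of x itself is used; pass the
    ground-truth sigma to normalise a reconstruction the way the Figure 1
    caption describes.
    """
    x = np.ravel(np.asarray(x, dtype=float))
    scale = x.std() if sigma is None else float(sigma)
    z = (x - x.mean()) / scale
    density, edges = np.histogram(z, bins=bins, density=True)
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, density


# Synthetic stand-in for a velocity dataset: uncorrelated Gaussian noise.
rng = np.random.default_rng(0)
signal = rng.standard_normal((2000, 512))

f_gauss = flatness(signal, lag=4)
centers, density = standardized_pdf(velocity_increments(signal, 4))
```

For a Gaussian signal the flatness stays close to 3 at every lag. Values that grow as the lag decreases are the signature of the intermittent, fat-tailed increment statistics that the captions describe.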