Multi-Scale Data Assimilation in Turbulent Models

Francesco Fossella, Luca Biferale, Alberto Carrassi, Massimo Cencini, Vikrant Gupta

TL;DR

The paper addresses the reconstruction of unobserved, strongly intermittent scales in chaotic, multiscale turbulence from sparse mesoscale measurements by applying the Ensemble Kalman Filter to a Sabra shell model. It introduces a real-valued extended-state formulation and a scale-aware inflation strategy to stabilize the filter and correctly propagate corrections across scales. The results show that observing two adjacent mesoscales at a frequency faster than the turnover time of the observed scales yields near-complete synchronization of larger and smaller scales, outperforming Nudging and matching En4D-Var at lower computational cost. These findings provide practical guidance for the design of data-assimilation systems in turbulent contexts and point toward extensions to more realistic flows and hybrid data-assimilation approaches.

Abstract

We explore the potential of Data Assimilation (DA) within the multi-scale framework of a shell model of turbulence, with a focus on the Ensemble Kalman Filter (EnKF). The central objective is to understand how measuring mesoscales (i.e., inertial-range scales) enhances the prediction of both large-scale and small-scale intermittent variables, by systematically varying the observation frequency and the set of measured scales. We demonstrate that measurements conducted at frequencies that exceed those of the observed scales enable full synchronization of larger scales, provided that at least two adjacent mesoscales are measured. In addition, we benchmark the EnKF against two other DA methods, namely Nudging and Ensemble 4D-Var. The EnKF is clearly superior to the former and comparable with the latter, while achieving the result at a lower computational complexity. Moreover, our results underscore the need for a tailored, scale-aware inflation technique to stabilize the assimilation process, preventing filter divergence and ensuring robust convergence.
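
To make the EnKF update mentioned above concrete, the following is a minimal sketch of one analysis step of a stochastic (perturbed-observation) EnKF acting on a real-valued state, assuming NumPy. It is not the authors' implementation: the function names `to_real_state` and `enkf_analysis`, the stacking of real and imaginary parts, and the per-component `inflation` vector are illustrative assumptions standing in for the extended-state formulation and the scale-aware inflation, whose exact form is not given here.

```python
import numpy as np

def to_real_state(u):
    """Stack real and imaginary parts of complex shell variables: (N,) -> (2N,)."""
    return np.concatenate([u.real, u.imag])

def enkf_analysis(X, z, obs_idx, obs_var, rng, inflation=None):
    """One stochastic EnKF analysis step (illustrative sketch).

    X         : (d, L) forecast ensemble, one real-valued member per column
    z         : (m,) measurement vector
    obs_idx   : (m,) indices of the observed state components
    obs_var   : observation-error variance (scalar, errors assumed uncorrelated)
    inflation : optional (d,) multiplicative factors, one per component
                (hypothetical placeholder for a scale-aware scheme)
    """
    d, L = X.shape
    mean = X.mean(axis=1, keepdims=True)
    A = X - mean                                   # ensemble anomalies
    if inflation is not None:
        A = A * np.asarray(inflation)[:, None]     # inflate spread per component
        X = mean + A
    HA = A[obs_idx, :]                             # anomalies in observation space
    P_xy = A @ HA.T / (L - 1)                      # state-observation covariance
    P_yy = HA @ HA.T / (L - 1) + obs_var * np.eye(len(obs_idx))
    K = np.linalg.solve(P_yy, P_xy.T).T            # Kalman gain P_xy P_yy^{-1}
    # Each member assimilates its own noisy copy of the measurement.
    noise = np.sqrt(obs_var) * rng.standard_normal((len(obs_idx), L))
    Z = z[:, None] + noise
    return X + K @ (Z - X[obs_idx, :])

# Toy usage: component 1 is correlated with the observed component 0,
# component 2 is independent of it.
rng = np.random.default_rng(0)
e = rng.standard_normal((3, 200))
X = np.vstack([e[0], 0.9 * e[0] + 0.1 * e[1], e[2]])
Xa = enkf_analysis(X, np.array([1.0]), np.array([0]), 0.01, rng)
print(X.var(axis=1), Xa.var(axis=1))
```

In this toy setting the spread of the unobserved but correlated component shrinks along with that of the observed one, because the gain is built from the ensemble cross-covariance. This is the mechanism by which a measurement of one variable can correct variables that are never measured directly.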

Paper Structure

This paper contains 14 sections, 35 equations, 11 figures, 3 tables.

Figures (11)

  • Figure 1: Multi-scale character of the Sabra Model. (a) Instantaneous estimate of characteristic turnover times. During one full oscillation of the slow shell $n = 4$, the fast shell $n = 15$---representative of the dissipative range---undergoes several hundred oscillations. (b) Quantitative visualization of the natural frequencies involved. The observation intervals examined are highlighted: $\Delta t_{\text{obs}} = \tau_{15}$ yields quasi-continuous inertial-range measurements, $\Delta t_{\text{obs}} = \tau_{9}$ represents an intermediate regime, and $\Delta t_{\text{obs}} = \tau_{4}$ corresponds to discrete sampling. (c) Normalized probability-density functions of shell amplitudes. Slow shells display nearly Gaussian statistics, whereas fast shells exhibit heavy-tailed, intermittent behavior.
  • Figure 2: The ensemble (blue lines) starts as a prior distribution around the ground truth (black line) for both an observed variable $\hat{U}_m^{(j)}$ (red dots: measurements $Z_m$) and an unobserved one $\hat{U}_n^{(j)}$. Only two variables and $L=4$ ensemble members are shown. Due to chaotic divergence, the ensemble spread increases over time. At each assimilation step, measurements are incorporated to produce the posterior $\tilde{U}_m^{(j)}$ (green circles), which serves as the initial condition for the next cycle. The Kalman update propagates observational information to unmeasured variables, reducing their uncertainty as well.
  • Figure 3: (a) Real parts of the velocity components for shells $n=1$, $6$, and $13$ from an experiment where $u_6$, $u_7$, and $u_8$ are measured with observation interval $\Delta t_{\text{obs}} = \tau_{15} = 0.002\tau_0$. The figure displays the full ensemble (light blue), its mean (solid blue line), and the ground truth (solid black line), over a total experiment duration of $20\tau_0$. (b) Probability density function (PDF) of the normalized real-part error, $\frac{\Re(u_n-\tilde{u}_n)^2}{\sqrt{\langle\Re(u_n)^2\rangle_T}\sqrt{\langle\Re(\tilde{u}_n)^2\rangle_T}}$, and (c) PDF of the phase difference, $\theta_n - \tilde{\theta}_n$, shown for all ensemble members (shaded area) and their mean (blue line) for the selected shells. The PDFs in (b) and (c) are computed by collecting data only after the transient phase---i.e., after saturation is reached (as indicated by the red vertical line in panel (a))---and up to the end of the $20\tau_0$ experiment. The light-green shaded zone highlights the time window $12\leq t/\tau_0\leq 14$ detailed in Figure 4.
  • Figure 4: (a) Ground truth energy spectrum $E_n = \langle |u_n|^2 \rangle_T$, EnKF estimate $\tilde{E}_n = \langle |\tilde{u}_n|^2 \rangle_{T,L}$, and $\ell^2$-error $\mathcal{E}_n = \langle |u_n - \tilde{u}_n|^2 \rangle_{T,L}$. The grey shaded area highlights the observed shells, while the vertical solid line marks the scale for which the turnover time matches the observation interval, $\tau_{15} = \Delta t_{\text{obs}}$. (b) Normalized error $\mathcal{E}_n / \sqrt{E_n \tilde{E}_n}$, with the dashed line indicating the scale beyond which statistical consistency ($E_n = \tilde{E}_n$) is maintained, but the EnKF no longer achieves effective reconstruction. All quantities in panels (a) and (b), including error bars, are averaged over $N_{\text{exp}} = 16$ independent experiments, as described in the experimental-setup section. (c) Normalized energy evolution for inertial-range scales during the stationary synchronization regime ($12 \leq t / \tau_0 \leq 14$; see light-green shaded region in Figure 3). Shown are the ground truth values $|u_n|^2 k_n^{2/3}$ (left), one randomly selected posterior ensemble member $|\tilde{u}_n|^2 k_n^{2/3}$ (middle), and their difference (right). Red vertical lines indicate the measured shells ($n=6$, $7$, and $8$). (d) Cumulative instantaneous energy flux evolution over the same stationary synchronization regime, shown for the ground truth (left), one randomly selected posterior ensemble member (middle), and their difference (right).
  • Figure 5: (a) Scatter plot of the mean errors $\langle|u_{13}-\tilde{u}_{13}|^2\rangle_L$ versus the energy $|u_{13}|^2$ at scale $n=13$, including the statistical independence baseline given by $\langle|u_{13}-\tilde{u}_{13}|^2\rangle_L=2|u_{13}|^2$, and five highlighted specific cases. Colored vertical lines through each point illustrate the full distribution of ensemble errors $|u_n-\tilde{u}_n^{(j)}|^2$, providing insight into variability around the mean value. (b) Box plots for the highlighted points in panel (a). In each box plot, the box spans the interquartile range (from the 25th to the 75th percentile), with the horizontal line inside indicating the median. The vertical whiskers extend to the 10th and 90th percentiles (or to the most extreme non-outlier values), while black dots denote outliers. The results shown in these panels were collected from an experiment lasting $20\tau_0$.
  • ...and 6 more figures
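
The normalized error $\mathcal{E}_n / \sqrt{E_n \tilde{E}_n}$ from the Figure 4 caption, and the factor 2 in the independence baseline of Figure 5, can be illustrated with a short sketch, assuming NumPy. Expanding $|u_n-\tilde{u}_n|^2 = |u_n|^2 + |\tilde{u}_n|^2 - 2\Re(u_n\tilde{u}_n^*)$ shows that, for zero-mean signals that are statistically independent and have equal energy, the cross term averages out and the normalized error tends to 2, while perfect synchronization gives 0. The function name `normalized_error`, the array layout, and the synthetic Gaussian data are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def normalized_error(u, u_est):
    """Per-shell normalized error E_n / sqrt(E_n * E~_n).

    u     : (T, N) complex ground-truth shell velocities
    u_est : (T, L, N) complex ensemble estimates (L members)
    """
    E = np.mean(np.abs(u) ** 2, axis=0)                    # <|u_n|^2>_T
    E_est = np.mean(np.abs(u_est) ** 2, axis=(0, 1))       # <|u~_n|^2>_{T,L}
    err = np.mean(np.abs(u[:, None, :] - u_est) ** 2, axis=(0, 1))
    return err / np.sqrt(E * E_est)

def complex_gaussian(rng, shape):
    """Zero-mean complex Gaussian samples (synthetic stand-in for shell data)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

rng = np.random.default_rng(0)
T, L, N = 4000, 8, 3
u = complex_gaussian(rng, (T, N))

# Perfectly synchronized ensemble: every member equals the truth.
synced = np.repeat(u[:, None, :], L, axis=1)
# Fully desynchronized ensemble: same statistics, independent of the truth.
independent = complex_gaussian(rng, (T, L, N))

print(normalized_error(u, synced))       # zero for every shell
print(normalized_error(u, independent))  # close to 2 for every shell
```

Values between 0 and 2 then quantify partial synchronization, which is how the normalized error in Figure 4(b) can be read under these assumptions.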