Neural ensemble Kalman filter: Data assimilation for compressible flows with shocks

Xu-Hui Zhou; Lorenzo Beronilla; Michael K. Sleeman; Hangchuan Hu; Matthias Morzfeld; Andrew M. Stuart; Tamer A. Zaki

Abstract

Data assimilation (DA) for compressible flows with shocks is challenging because many classical DA methods generate spurious oscillations and nonphysical features near uncertain shocks. We focus here on the ensemble Kalman filter (EnKF). We show that the poor performance of the standard EnKF may be attributed to the bimodal forecast distribution that can arise in the vicinity of an uncertain shock location; this violates the assumption underpinning the EnKF that the forecast distribution is close to Gaussian. To address this issue, we introduce the neural EnKF. The basic idea is to systematically embed neural function approximations within ensemble DA by mapping the forecast ensemble of shocked flows to the parameter space (weights and biases) of a deep neural network (NN) and subsequently performing DA in that space. The nonlinear mapping encodes sharp and smooth flow features in an ensemble of NN parameters. Neural EnKF updates are therefore well-behaved only if the NN parameters vary smoothly within the neural representation of the forecast ensemble. We show that such a smooth variation of network parameters can be enforced via physics-informed transfer learning, and demonstrate that in so doing the neural EnKF avoids the spurious oscillations and nonphysical features that plague the standard EnKF. The applicability of the neural EnKF is demonstrated through a series of systematic numerical experiments with the inviscid Burgers' equation, Sod's shock tube, and a two-dimensional blast wave.
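To make the two ingredients of the abstract concrete, the following is a minimal sketch, not the authors' implementation. It builds a hyperbolic-tangent "shock" ensemble with an uncertain location and applies a textbook stochastic (perturbed-observation) EnKF analysis step. The function name `enkf_update`, the profile width, the spread of the shock location, the observation layout, and the noise level are all assumptions chosen for illustration; the paper may use a different EnKF variant and different settings. Because the update only needs an ensemble matrix and the matching predicted observations, the same routine could in principle be applied to an ensemble of NN parameters instead of physical states.

```python
import numpy as np

def enkf_update(ens_f, obs_f, d, R, rng):
    """Textbook stochastic (perturbed-observation) EnKF analysis step.

    ens_f : (n_state, n_e) forecast ensemble (physical states or NN parameters)
    obs_f : (n_obs, n_e)   predicted observations of each member
    d     : (n_obs,)       observation vector
    R     : (n_obs, n_obs) observation-error covariance
    """
    n_e = ens_f.shape[1]
    A = ens_f - ens_f.mean(axis=1, keepdims=True)   # state anomalies
    Y = obs_f - obs_f.mean(axis=1, keepdims=True)   # predicted-observation anomalies
    C_xy = A @ Y.T / (n_e - 1)                      # state/observation cross-covariance
    C_yy = Y @ Y.T / (n_e - 1)                      # predicted-observation covariance
    # One perturbed copy of the observations per ensemble member.
    eps = rng.multivariate_normal(np.zeros(len(d)), R, size=n_e).T
    innov = d[:, None] + eps - obs_f
    # Kalman gain applied to the innovations, without forming an explicit inverse.
    return ens_f + C_xy @ np.linalg.solve(C_yy + R, innov)

# Illustrative setup (all values are assumptions, not taken from the paper).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 201)
n_e = 50
shift = rng.normal(0.0, 0.1, size=n_e)                    # uncertain shock location
ens_f = -np.tanh((x[:, None] - shift[None, :]) / 0.02)    # shape (201, 50)

# Near the mean shock location (x = 0) most members sit close to +1 or -1,
# which is the bimodal marginal that conflicts with the Gaussian assumption.
frac_saturated = np.mean(np.abs(ens_f[100]) > 0.9)

# Sparse noisy observations of a hypothetical "true" profile.
obs_idx = np.arange(0, x.size, 20)
sigma_obs = 0.05
truth = -np.tanh((x - 0.05) / 0.02)
d = truth[obs_idx] + rng.normal(0.0, sigma_obs, size=obs_idx.size)
R = sigma_obs**2 * np.eye(obs_idx.size)

ens_a = enkf_update(ens_f, ens_f[obs_idx], d, R, rng)
```

Plotting the columns of `ens_a` against `x` is one way to check whether the analysis members keep a single sharp transition or develop the kind of oscillations described in the abstract.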

Paper Structure (18 sections, 28 equations, 19 figures, 2 tables, 1 algorithm)

Introduction
Motivation
Ensemble Kalman filter (EnKF)
Ensemble Kalman filtering in the presence of a "shock"
Methodology
Neural EnKF
Navigating non-convex loss landscapes with progressive training
Nearest-neighbor chain training
Revisiting the motivation example
Numerical experiments with neural EnKF
Numerical experiments on the inviscid Burgers' equation
Numerical experiments on the shock tube problem
Numerical experiments on the blast wave problem
Conclusion
Neural network architectures
...and 3 more sections

Figures (19)

Figure 1: Behavior of the standard EnKF for a shock-like transition, illustrated using a hyperbolic-tangent example: (a) forecast ensemble with 50 members; (b) analysis ensemble obtained using the standard EnKF, showing spurious oscillations near the shock; and (c) marginal probability density functions (PDFs) of the forecast ensemble at two representative locations indicated in (a), showing an approximately Gaussian distribution away from the shock ($x=-0.5$) and a bimodal distribution near the shock location ($x=0$). For clarity, the PDFs in (c) are estimated using 10,000 forecast ensemble members.
Figure 2: Schematic of the neural EnKF framework: at each DA cycle, forward simulations generate a forecast ensemble $\{\mathbf{z}_i^{\mathrm{f}}(\bm{x})\}_{i=1}^{n_e}$; each forecast state is then parameterized by a deep neural network $\mathsf{F}_{\mathrm{NN}}(\bm{\theta}_i^{\mathrm{f}}; \bm{x})$, and the corresponding network parameters $\bm{\theta}_i^{\mathrm{f}}$ are updated to $\bm{\theta}_i^{\mathrm{a}}$ through assimilation of the observations $\mathbf{d}$; and finally, the updated networks reconstruct the analysis states $\{\mathbf{z}_i^{\mathrm{a}}(\bm{x})\}_{i=1}^{n_e}$ in physical space.
Figure 3: Schematic illustration of neural-network training across ensemble members on a non-convex loss landscape. (a) Independent training: each ensemble member is initialized from a different random starting point $\bm{\theta}_i^{(0)}$ and optimized independently, leading to convergence toward distinct local minima $\bm{\theta}_i^\ast$ and a poorly aligned ensemble in neural space (e.g., ensemble members 1 and 3 converge to nearby minima, whereas member 2 converges to a separated minimum). (b) Progressive training: ensemble members are initialized from the trained parameters of a nearby ensemble member (e.g., $\bm{\theta}_3^\ast$), guiding successive optimization processes toward neighboring regions of the loss landscape and promoting alignment of ensemble parameters in neural space.
Figure 4: Schematic illustration of the nearest-neighbor chain construction: (a) ensemble members in the chosen metric space; (b) initialization of the chain by selecting the ensemble medoid, defined as the member with the smallest average distance to all others; and (c)--(h) iterative construction of the chain, where at each step the ensemble member closest to the currently selected set is appended. Unselected members are shown in grey, previously selected members in light blue, and the member added at the current step is highlighted in dark blue. Solid/black lines indicate the resulting nearest-neighbor chain, with numbers denoting the order of insertion. Dashed/red arrows indicate the parent–child relationship used for training initialization, pointing from the nearest previously selected member (shown in light blue with a dark outline) to the newly added member.
Figure 5: Comparison of neural EnKF updates under two training strategies for the hyperbolic-tangent surrogate shock example. Panels (a1--c1) correspond to independent training with random initialization, while panels (a2--c2) correspond to the proposed nearest-neighbor chain training. (a1) Forecast ensemble together with the corresponding neural-network fits for each ensemble member, indicating negligible fitting errors. (b1) Analysis ensemble obtained using independent training, exhibiting irregular and spatially incoherent updates across ensemble members. (c1) Variation of neural-network parameters across ensemble members under independent training, showing highly irregular variability and poor alignment of parameters in neural space. (a2--b2) Forecast and analysis ensembles obtained using nearest-neighbor chain training, in which the analysis ensemble preserves the shock-like structure and better reflects the observation data. (c2) Smooth neural-network parameter variations under nearest-neighbor chain training, showing well-aligned parameter distributions in neural space. In panels (c1) and (c2), the horizontal axis indexes the ensemble members (50 samples), while the vertical axis indexes the flattened neural-network parameters (169 trainable parameters).
...and 14 more figures
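The nearest-neighbor chain construction described in the caption of Figure 4 can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' code: the function name `nearest_neighbor_chain` is hypothetical, and the Euclidean distance between flattened ensemble members stands in for the "chosen metric space", which the caption does not specify. The sketch starts from the ensemble medoid, repeatedly appends the unselected member closest to the selected set, and records for each new member the nearest previously selected member as its parent.

```python
import numpy as np

def nearest_neighbor_chain(members):
    """Order ensemble members for progressive training.

    members : (n_e, n_features) array, one flattened member per row.
    Returns (order, parent): the insertion order, and a dict mapping each
    member to the previously selected member nearest to it (None for the root).
    Assumption: Euclidean distance is used as the metric.
    """
    diff = members[:, None, :] - members[None, :, :]
    D = np.linalg.norm(diff, axis=-1)            # pairwise distance matrix
    n_e = D.shape[0]

    root = int(np.argmin(D.mean(axis=1)))        # medoid: smallest average distance
    order = [root]
    parent = {root: None}
    selected = np.zeros(n_e, dtype=bool)
    selected[root] = True

    # For every member, track the distance to the selected set and
    # which selected member attains it.
    best_dist = D[root].copy()
    best_par = np.full(n_e, root)

    for _ in range(n_e - 1):
        candidates = np.where(selected, np.inf, best_dist)
        j = int(np.argmin(candidates))           # closest unselected member
        order.append(j)
        parent[j] = int(best_par[j])
        selected[j] = True
        closer = D[j] < best_dist                # members now nearer to j
        best_dist = np.where(closer, D[j], best_dist)
        best_par = np.where(closer, j, best_par)

    return order, parent

# Small usage example with five one-dimensional "members".
demo = np.array([[0.0], [1.0], [2.1], [3.3], [10.0]])
order, parent = nearest_neighbor_chain(demo)
```

In a progressive-training loop, one would visit members in `order` and initialize each member's network from the trained parameters of `parent[j]`, with the root trained from a random initialization; that training step is not shown here.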
