Learning Closed-form Equations for Subgrid-scale Closures from High-fidelity Data: Promises and Challenges

Karan Jakhar; Yifei Guan; Rambod Mojgani; Ashesh Chattopadhyay; Pedram Hassanzadeh

TL;DR

This work shows that equation discovery applied to high-fidelity 2D-FHIT and RBC data yields subgrid-scale (SGS) closures whose form matches the nonlinear gradient model (NGM) across common filters (e.g., Gaussian, box). The learned closures correlate highly with the true fluxes in a priori tests but fail to deliver stable a posteriori LES: in 2D-FHIT, NGM produces zero inter-scale kinetic-energy transfer, and in RBC it predicts potential-energy backscattering poorly. The results indicate that the leading Taylor-series term dominates the learned closures and that the coefficients depend on the filter type/size rather than on the fluid/flow properties, which motivates physics-informed libraries, loss functions, and metrics. The study highlights the need to incorporate inter-scale energy transfer constraints and memory effects to obtain accurate, stable closures in multi-scale Earth-system simulations, and it provides a framework to refine equation-discovery approaches for SGS modeling.
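The a priori comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a synthetic, band-limited random velocity field in place of DNS data, a doubly periodic $2\pi$ box, a Gaussian filter with transfer function $\exp(-k^2\Delta^2/24)$, and no coarse-graining onto an LES grid. All names (`gaussian_filter`, `grad`, `delta`) are hypothetical. It diagnoses $\tau_{xy}=\overline{uv}-\overline{u}\,\overline{v}$ and correlates it with the leading Taylor-series term, $(\Delta^2/12)\,\nabla\overline{u}\cdot\nabla\overline{v}$.

```python
import numpy as np

N = 128                                    # grid points per direction (2*pi-periodic box)
k1 = np.fft.fftfreq(N, d=1.0 / N)          # integer wavenumbers
kx, ky = np.meshgrid(k1, k1, indexing="ij")
k2 = kx**2 + ky**2

# Synthetic incompressible field from a random, band-limited streamfunction
# (an assumption for illustration; the paper uses filtered DNS data).
rng = np.random.default_rng(0)
noise = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
psi = np.fft.ifft2(noise * ((k2 > 0) & (k2 <= 64))).real
psi_hat = np.fft.fft2(psi)
u = np.fft.ifft2(1j * ky * psi_hat).real   # u =  d(psi)/dy
v = np.fft.ifft2(-1j * kx * psi_hat).real  # v = -d(psi)/dx

delta = 2.0 * np.pi / 32                   # filter width (assumed value)
G = np.exp(-k2 * delta**2 / 24.0)          # Gaussian filter transfer function

def gaussian_filter(f):
    """Apply the Gaussian filter in spectral space."""
    return np.fft.ifft2(G * np.fft.fft2(f)).real

def grad(f):
    """Spectral x- and y-derivatives of a periodic field."""
    f_hat = np.fft.fft2(f)
    return np.fft.ifft2(1j * kx * f_hat).real, np.fft.ifft2(1j * ky * f_hat).real

ub, vb = gaussian_filter(u), gaussian_filter(v)

# "True" SGS stress diagnosed from the unfiltered field
tau_xy = gaussian_filter(u * v) - ub * vb

# Leading Taylor-series term (NGM form) built only from filtered variables
ubx, uby = grad(ub)
vbx, vby = grad(vb)
ngm_xy = delta**2 / 12.0 * (ubx * vbx + uby * vby)

cc = np.corrcoef(tau_xy.ravel(), ngm_xy.ravel())[0, 1]
print(f"a priori correlation between tau_xy and NGM: {cc:.4f}")
```

Because the synthetic field is smooth relative to the filter width, the correlation here is expected to be high; this only illustrates the metric and says nothing about a posteriori stability.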

Abstract

There is growing interest in discovering interpretable, closed-form equations for subgrid-scale (SGS) closures/parameterizations of complex processes in Earth systems. Here, we apply a common equation-discovery technique with expansive libraries to learn closures from filtered direct numerical simulations of 2D turbulence and Rayleigh-Bénard convection (RBC). Across common filters (e.g., Gaussian, box), we robustly discover closures of the same form for momentum and heat fluxes. These closures depend on nonlinear combinations of gradients of filtered variables, with constants that are independent of the fluid/flow properties and only depend on filter type/size. We show that these closures are the nonlinear gradient model (NGM), which is derivable analytically using a Taylor-series expansion. Indeed, we suggest that with common (physics-free) equation-discovery algorithms, for many common systems/physics, discovered closures are consistent with the leading term of the Taylor series (except when cutoff filters are used). Like previous studies, we find that large-eddy simulations with NGM closures are unstable, despite significant similarities between the true and NGM-predicted fluxes (correlations $> 0.95$). We identify two shortcomings as reasons for these instabilities: in 2D, NGM produces zero kinetic energy transfer between resolved and subgrid scales, lacking both diffusion and backscattering. In RBC, potential energy backscattering is poorly predicted. Moreover, we show that SGS fluxes diagnosed from data, presumed the "truth" for discovery, depend on filtering procedures and are not unique. Accordingly, to learn accurate, stable closures in future work, we propose several ideas around using physics-informed libraries, loss functions, and metrics. These findings are relevant to closure modeling of any multi-scale system.
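The statement that NGM produces zero kinetic energy transfer in 2D can be checked numerically with a short sketch. It assumes the standard NGM form $\tau_{ij} = (\Delta^2/12)\,\partial_k\overline{u}_i\,\partial_k\overline{u}_j$, the transfer convention $P = -\tau_{ij}\overline{S}_{ij}$ (sign conventions vary), and a synthetic incompressible field standing in for the filtered velocity. The variable names and the value of `delta` are hypothetical. For a traceless $2\times 2$ velocity-gradient tensor the contraction vanishes pointwise, so `P` should be zero up to round-off.

```python
import numpy as np

N = 64                                     # grid points per direction (2*pi-periodic box)
k1 = np.fft.fftfreq(N, d=1.0 / N)          # integer wavenumbers
kx, ky = np.meshgrid(k1, k1, indexing="ij")

def grad(f):
    """Spectral x- and y-derivatives of a periodic field."""
    f_hat = np.fft.fft2(f)
    return np.fft.ifft2(1j * kx * f_hat).real, np.fft.ifft2(1j * ky * f_hat).real

# Synthetic 2D incompressible "filtered" velocity from a random streamfunction
# (an assumption for illustration, not FDNS data).
rng = np.random.default_rng(1)
noise = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
psi = np.fft.ifft2(noise * (kx**2 + ky**2 <= 36)).real
psi_x, psi_y = grad(psi)
u, v = psi_y, -psi_x                       # divergence-free by construction

ux, uy = grad(u)
vx, vy = grad(v)

delta = 2.0 * np.pi / 16                   # filter width (assumed value)
c = delta**2 / 12.0

# NGM stress components: tau_ij = c * (d_k u_i)(d_k u_j)
txx = c * (ux * ux + uy * uy)
txy = c * (ux * vx + uy * vy)
tyy = c * (vx * vx + vy * vy)

# Resolved strain-rate tensor
sxx, syy, sxy = ux, vy, 0.5 * (uy + vx)

# Pointwise inter-scale kinetic energy transfer, P = -tau_ij * S_ij
P = -(txx * sxx + 2.0 * txy * sxy + tyy * syy)

# Reference magnitude of a single term in the contraction, for comparison
ref = c * max(np.abs(g).max() for g in (ux, uy, vx, vy)) ** 3
print(f"max |P| = {np.abs(P).max():.3e}, reference magnitude = {ref:.3e}")
```

The cancellation is algebraic rather than statistical: it holds at every grid point for any 2D incompressible field, which is why the closure can provide neither diffusion nor backscatter of kinetic energy.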

Paper Structure (21 sections, 54 equations, 7 figures, 9 tables)

Introduction
Models, Methods, and Data
Filtering Procedure
Two-dimensional Forced Homogeneous Isotropic Turbulence (2D-FHIT)
Turbulent Rayleigh-Bénard Convection (RBC)
Filtered Direct Numerical Simulation (FDNS) Data
The Equation-discovery Method
Results
The Discovered Closures for SGS Momentum and Heat Fluxes
The Nonlinear Gradient Model (NGM): Taylor-series Expansion of the SGS Term
Effects of numerical discretization
A posteriori (Online) Tests and Inter-scale Energy/Enstrophy Transfer
A Physics-guided Library: Pope Tensors
Decomposition of SGS Fluxes: Leonard, Cross, and Reynolds Stresses
Summary and Discussion
...and 6 more sections

Figures (7)

Figure 1: Snapshots of the (a) DNS vorticity field $\omega$ ($N_{\text{DNS}}=1024$) and the (b) FDNS vorticity field $\overline{\omega}$ ($N_{\text{LES}}=128$) for Case K2 (see \ref{table:2d-fhit cases}). The (c) DNS temperature field $T$ ($N^{\text{DNS}}_x=2048$), and the (d) FDNS temperature field ${\overline{T}}$ ($N_{\text{LES}}=256$) for Case R3 (see \ref{table:rbc cases}). The Gaussian filter is applied in both cases.
Figure 2: Representative examples of the effects of increasing the sparsity-level hyper-parameter, $\alpha$, on the CC and number of terms in the discovered closure. (a), (c): $\tau_{yy}$ (2D-FHIT) and (b), (d): $J_x$ (RBC). A Gaussian filter with $N_{\text{LES}}=128$ (for Cases K1-K3) and $N_{\text{LES}}=256$ (for Cases R1-R3) is used, but the same behavior is observed with any other $N_{\text{LES}}$ and filter type (except for the sharp-spectral filter, see the text). In general, for small $\alpha \; (<1)$, no closure is discovered (CC $=0$, zero terms). With increasing $\alpha$, the CC converges to $\sim 1$ (a more accurate a priori closure) but at the expense of a larger closure with many more terms (note the logarithmic scale of the $y$ axes in panels (c)-(d)). However, the CC-$\alpha$ relationship forms an "L-curve", whose elbow indicates the $\alpha$ that balances accuracy and model size (see the text).
Figure 3: Examples of the spectra of SGS fluxes predicted using NGM compared to those diagnosed using FDNS (the truth). (a) $\tau_{xy}$ from Case K1 and (b) $J_z$ from Case R1 for 3 different $N_{\rm{LES}}$. A Gaussian filter is used for FDNS, but the same behavior is observed for box and Gaussian+box filters. Here, $|\hat{\cdot}|$ is the modulus of Fourier coefficients.
Figure 4: The first row shows examples of snapshots of the SGS stress, $\tau_{xy}$, for Case K1, diagnosed from FDNS data using different filters and $N_{\text{LES}}=128$ (see \ref{table:2d-fhit cases}). Rows 2-4 show the three components of this $\tau_{xy}$: the Leonard stress, $L_{xy}$, cross stress, $C_{xy}$, and Reynolds stress, $R_{xy}$. Note the substantially different ranges of the colorbars.
Figure 5: The $L_2$-norm of the SGS components versus $N_{\text{LES}}$. (a) $\tau_{xx}$ from Case K3. (b) $J_z$ from Case R3. The contribution of each SGS component depends on the filter size: as $N_{\text{LES}}$ decreases, i.e., $\Delta$ increases, the relative importance of the Reynolds stress (Leonard stress) increases (decreases). The norms of all SGS components are normalized by the norm of the respective SGS flux. A Gaussian filter is used, but the same behavior is observed for the box and Gaussian+box filters.
...and 2 more figures
