Hyperbolic Machine Learning Moment Closures for the BGK Equations

Andrew J. Christlieb, Mingchang Ding, Juntao Huang, Nicholas A. Krupansky

TL;DR

This work develops hyperbolicity-preserving neural network closures for the Grad moment expansion of the BGK equation. It learns gradients of the unclosed highest moment to close the system, enforcing hyperbolicity and Galilean invariance while solving with a FORCE-based non-conservative scheme. Two closure paradigms are explored: (i) a Hyperbolic Moment Equation (HME) closure trained on HME data, and (ii) a kinetic BGK closure trained on high-fidelity DVM data, with results showing accurate behavior across Knudsen numbers within training windows and reasonable extrapolation beyond. The approach offers a local, data-driven closure that captures kinetic effects more faithfully than local equilibrium assumptions and demonstrates practical robustness for multi-scale transport problems. Potential extensions include enhanced regularization, transfer learning from HME to kinetic closures, and development of a symmetrizer for BGK-type systems.
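The FORCE flux mentioned above can be illustrated on a much simpler problem. The sketch below is the classical conservative FORCE flux, that is, the average of the Lax-Friedrichs flux and the two-step Lax-Wendroff (Richtmyer) flux, applied to scalar linear advection on a periodic grid. It is not the non-conservative variant used in the paper, which is not reproduced here; the function names `force_flux` and `advect_step` and the test problem are assumptions made for illustration.

```python
def force_flux(u_left, u_right, flux, dx, dt):
    """Classical FORCE flux at one interface (conservative form, illustrative only)."""
    f_left, f_right = flux(u_left), flux(u_right)
    # Lax-Friedrichs flux
    lax_friedrichs = 0.5 * (f_left + f_right) - 0.5 * (dx / dt) * (u_right - u_left)
    # Two-step Lax-Wendroff (Richtmyer) flux: evaluate the flux at a half-step state
    u_half = 0.5 * (u_left + u_right) - 0.5 * (dt / dx) * (f_right - f_left)
    lax_wendroff = flux(u_half)
    # FORCE is the arithmetic mean of the two
    return 0.5 * (lax_friedrichs + lax_wendroff)


def advect_step(u, speed, dx, dt):
    """One finite-volume update of u_t + speed * u_x = 0 on a periodic grid."""
    def flux(v):
        return speed * v

    n = len(u)
    # interface_flux[i] is the flux between cell i and cell i + 1 (periodic wrap)
    interface_flux = [force_flux(u[i], u[(i + 1) % n], flux, dx, dt) for i in range(n)]
    # interface_flux[i - 1] with i = 0 wraps to the last interface, as intended
    return [u[i] - (dt / dx) * (interface_flux[i] - interface_flux[i - 1]) for i in range(n)]
```

For linear advection at a CFL number of one, both component fluxes reduce to the upwind flux, so a single step shifts the profile by exactly one cell. At smaller CFL numbers the scheme is diffusive but still conserves the cell sum.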

Abstract

We introduce a hyperbolic closure for the Grad moment expansion of the Bhatnagar-Gross-Krook (BGK) kinetic model using a neural network (NN) trained on BGK moment data. This closure is motivated by the exact closure for the free streaming limit that we derived in our paper on closures in transport \cite{Huang2022-RTE1}. The exact closure relates the gradient of the highest moment to the gradients of four lower moments. As in our past work, the model presented here learns the gradient of the highest moment as a combination of the gradients of all lower moments, with the coefficients supplied by the NN. By necessity, this means that the resulting hyperbolic system is not conservative in the highest moment. For stability, the output layers of the NN are designed to enforce hyperbolicity and Galilean invariance. This allows the model to be run outside the training window of the NN. Unlike our previous work on radiation transport, which dealt with linear models, the nonlinearity of the BGK model demanded advanced training tools. These comprised optimal learning rate discovery, one-cycle training, batch normalization in each neural layer, and the \texttt{AdamW} optimizer. To address the non-conservative structure of the hyperbolic model, we adopt the FORCE numerical method to achieve robust solutions. The result is a comprehensive computing model that combines learned closures with methods for solving hyperbolic models. The proposed model captures accurate moment solutions across a broad spectrum of Knudsen numbers. This paper details the construction of the multi-scale model and tests it on a range of problems.
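To make the one-cycle training idea concrete, here is a minimal sketch of a one-cycle learning rate schedule: the rate rises from a small starting value to a peak, then anneals to a much smaller final value. The cosine interpolation, the function name `one_cycle_lr`, and the default parameters (`warmup_frac`, `start_div`, `final_div`) are illustrative assumptions, not the settings used in the paper. The optimizer and the batch normalization layers are not shown.

```python
import math


def _cos_interp(start, end, frac):
    """Cosine interpolation from start (frac = 0) to end (frac = 1)."""
    return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * frac))


def one_cycle_lr(step, total_steps, max_lr, warmup_frac=0.3, start_div=25.0, final_div=1e4):
    """Learning rate at a given step of a one-cycle schedule (hypothetical defaults)."""
    if total_steps < 2 or not 0 <= step < total_steps:
        raise ValueError("need total_steps >= 2 and 0 <= step < total_steps")
    if not 0.0 < warmup_frac < 1.0:
        raise ValueError("warmup_frac must lie strictly between 0 and 1")
    start_lr = max_lr / start_div      # rate at the first step
    final_lr = start_lr / final_div    # rate at the last step
    peak_step = warmup_frac * (total_steps - 1)
    if step <= peak_step:
        # warm-up phase: ramp from start_lr up to max_lr
        return _cos_interp(start_lr, max_lr, step / peak_step)
    # annealing phase: decay from max_lr down to final_lr
    return _cos_interp(max_lr, final_lr, (step - peak_step) / (total_steps - 1 - peak_step))
```

In practice the peak rate `max_lr` would be chosen from a learning rate range test, such as the one described in the Figure 2 caption.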

Paper Structure

This paper contains 22 sections, 5 theorems, 75 equations, 9 figures, 6 tables, 2 algorithms.

Key Result

Theorem 2.1

Consider the following moment closure system without collision terms, $\partial_t \mathbf{w} + A(\mathbf{w})\,\partial_x \mathbf{w} = 0$, where $\mathbf{w} = (\rho, u, \theta, f_3, f_4,\cdots,f_M)^T \in \mathbb{R}^{M+1}$. The system is Galilean invariant if and only if $(A(\mathbf{w})-uI)$ is independent of $u$, where $I$ is the identity matrix.
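The criterion in Theorem 2.1 can be checked by hand on a small example. The sketch below uses the three-moment truncation $\mathbf{w} = (\rho, u, \theta)^T$ with all higher moments set to zero, which is the Euler system for a gas with one velocity dimension (adiabatic index 3) written in primitive variables. This truncation is an assumption chosen for illustration and is not the learned closure from the paper; the function names are hypothetical.

```python
import math


def euler_matrix(rho, u, theta):
    """Coefficient matrix A(w) of w_t + A(w) w_x = 0 for w = (rho, u, theta)."""
    return [[u, rho, 0.0],
            [theta / rho, u, 1.0],
            [0.0, 2.0 * theta, u]]


def shifted_matrix(rho, u, theta):
    """A(w) - u I, which Theorem 2.1 requires to be independent of u."""
    a = euler_matrix(rho, u, theta)
    return [[a[i][j] - (u if i == j else 0.0) for j in range(3)] for i in range(3)]


def wave_speeds(rho, u, theta):
    """Eigenvalues of A(w) for this truncation: u and u +/- sqrt(3 theta)."""
    c = math.sqrt(3.0 * theta)
    return (u - c, u, u + c)


def char_poly(rho, u, theta, lam):
    """det(A(w) - lam I); it vanishes when lam is an eigenvalue of A(w)."""
    m = euler_matrix(rho, u, theta)
    for i in range(3):
        m[i][i] -= lam
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
```

Calling `shifted_matrix` with the same $\rho$ and $\theta$ but different $u$ returns identical matrices, so this truncation satisfies the criterion. Its eigenvalues are real for $\theta > 0$, so it is also hyperbolic, and they shift rigidly with $u$, as expected of a Galilean invariant system.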

Figures (9)

  • Figure 1: Neural network architecture, from the left: the light blue $f_k$ are moments from the training data set. The boxed gray circles with $\sigma$ represent the fully connected neural network; each column should be thought of as a block of layers (linear, activation, batch normalization). The yellow block applies Vieta's formula, and the final green block transforms the output into coefficients for the moment gradients.
  • Figure 1: Moments calculated using a kinetic-trained NN without and with the hyperbolic method, for a Knudsen number of 0.001. The prediction is at $t=0.3$.
  • Figure 2: Examples of learning rate range tests, showing the effects of batch normalization, and a sample one-cycle learning rate schedule. Learning rate range test: representative learning rate range for a 9-layer, 256-neuron-wide network trained on kinetic smooth/wave initial conditions. The horizontal axes are the learning rate and the vertical axes are the loss. Each color represents the moment that is being closed by the network. On the left, the networks have no batch normalization layer; on the right, they do. With batch normalization (right), the regions of decreasing loss overlap and are spread over a wider range of learning rates than for networks without it. One-cycle schedule: an example of the one-cycle learning rate variation. The horizontal axis is the training epoch and the vertical axis is the learning rate.
  • Figure 2: Galilean invariance test using a kinetic-trained closure, shown for the first three moments, $\rho,u,\theta$, at a Knudsen number of 0.063. The NN closure was trained with a macroscopic velocity of $u=0$ and was used to predict for an initial condition with $u=0.1$. The prediction is at $t=0.3$.
  • Figure 3: HME-trained closure: sample of five moments ($\rho, u, \theta, f_3, f_4$) for Knudsen number $\tau=1$ at $t=10$, beyond the time range of the training data. The closure was trained on smooth and wave data, then used to predict moments for smooth initial conditions.
  • ...and 4 more figures
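The first Figure 1 caption mentions a block that applies Vieta's formula. Vieta's formulas relate the coefficients of a monic polynomial to the elementary symmetric polynomials of its roots, so a polynomial built from real roots has real roots by construction. The sketch below shows only that conversion; how the paper maps the resulting coefficients to the coefficients of the moment gradients is not described in this summary and is not attempted here. The function names are hypothetical.

```python
def monic_coefficients_from_roots(roots):
    """Coefficients [1, c_1, ..., c_n] of prod_i (x - r_i), highest degree first."""
    coeffs = [1.0]
    for r in roots:
        # Multiply the current polynomial by (x - r):
        # appending a zero multiplies by x, then subtract r times the old polynomial.
        updated = coeffs + [0.0]
        for k in range(1, len(updated)):
            updated[k] -= r * coeffs[k - 1]
        coeffs = updated
    return coeffs


def elementary_symmetric(roots):
    """Elementary symmetric polynomials e_1..e_n via Vieta: c_k = (-1)^k e_k."""
    coeffs = monic_coefficients_from_roots(roots)
    return [(-1) ** k * coeffs[k] for k in range(1, len(coeffs))]


def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given highest-degree-first coefficients."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value
```

For example, the roots 1, 2, 3 give the polynomial $x^3 - 6x^2 + 11x - 6$, with elementary symmetric polynomials 6, 11, and 6.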

Theorems & Definitions (9)

  • Theorem 2.1
  • Proof 1
  • Corollary 2.1
  • Definition 1: lower Hessenberg matrix \cite{Huang-RTE3}
  • Definition 2: associated polynomial sequence \cite{Elouafi2009, Huang-RTE3}
  • Theorem 3: \cite{Elouafi2009, Huang-RTE3}
  • Theorem 4: \cite{Huang-RTE3}
  • Theorem 5
  • Proof 2