A Bi-fidelity based asymptotic-preserving neural network for the semiconductor Boltzmann equation and its inverse problem

Liu Liu, Xueyu Zhu, Zhenyi Zhu

TL;DR

The paper tackles multiscale semiconductor kinetic problems governed by the Boltzmann equation, where small Knudsen numbers ε cause stiff, ill-conditioned learning dynamics for standard neural solvers. It introduces Bi-fidelity Asymptotic-Preserving Neural Networks (BI-APNNs), which decompose the macroscopic density as ρ = ρ_{diff} + ε ρ_{corr} (explicit) or ρ = ρ_{diff} + ρ_{corr} (implicit), leveraging a pretrained diffusion-like model to accelerate training and improve accuracy, especially in the fluid-dynamic limit. BI-APNNs integrate a micro-macro neural framework with a precomputed diffusion component and a small correction network, and extend to Boltzmann-Poisson systems with a separate φ-network when needed, delivering superior forward solutions and more reliable inverse parameter estimation under partial data. Theoretical convergence results and extensive numerical experiments demonstrate faster training and higher robustness for BI-APNNs across kinetic and diffusive regimes, highlighting their practical value for efficient, accurate multiscale kinetic simulations and inverse problems in semiconductor device modeling.
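The two density decompositions named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`init_mlp`, `mlp`, `rho_bifidelity`), the layer widths, the `tanh` activation, and the use of plain NumPy are all hypothetical choices made for the example. In practice the diffusion network would be pretrained on the limiting model and then held fixed, while only the smaller correction network is trained.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights for a fully connected network (hypothetical helper)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, z):
    """Forward pass with tanh hidden layers and a linear output layer."""
    h = z
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b

def rho_bifidelity(diff_params, corr_params, t, x, eps, explicit=True):
    """Combine a (frozen) diffusion-like network with a small correction network.

    explicit=True  -> rho = rho_diff + eps * rho_corr
    explicit=False -> rho = rho_diff + rho_corr
    """
    z = np.stack([t, x], axis=-1)              # inputs (t, x), shape (N, 2)
    rho_diff = mlp(diff_params, z)[..., 0]     # assumed pretrained, not updated
    rho_corr = mlp(corr_params, z)[..., 0]     # the only trainable part
    scale = eps if explicit else 1.0
    return rho_diff + scale * rho_corr

# Example call: the correction network is deliberately narrower than rho_diff's.
rng = np.random.default_rng(0)
diff_params = init_mlp([2, 32, 32, 1], rng)
corr_params = init_mlp([2, 8, 1], rng)
t = np.full(5, 0.1)
x = np.linspace(0.0, 1.0, 5)
rho = rho_bifidelity(diff_params, corr_params, t, x, eps=1e-3)
```

In the explicit form the correction is multiplied by ε, so the combined output reduces to the diffusion component as ε tends to zero.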

Abstract

This paper introduces a Bi-fidelity Asymptotic-Preserving Neural Networks (BI-APNNs) framework, designed to efficiently solve forward and inverse problems for the semiconductor Boltzmann equation. Our approach builds upon the Asymptotic-Preserving Neural Networks (APNNs) methodology \cite{APNN-transport}, which employs a micro-macro decomposition to handle the model's multiscale nature. We specifically address a key bottleneck in the original APNNs: the slow convergence of the macroscopic density $\rho$ in the near fluid-dynamic regime, i.e., for small Knudsen numbers $\varepsilon$. The core innovation of BI-APNNs is a novel bi-fidelity decomposition of the macroscopic quantity $\rho$, whose leading component accurately approximates the true density at small $\varepsilon$ and can be efficiently pre-trained. A separate and more compact neural network is then tasked with learning only the minor correction term, $\rho_{\text{corr}}$. This strategy not only significantly accelerates training convergence but also improves the accuracy of the forward problem solution, particularly in the challenging fluid-dynamic limit. We also demonstrate through extensive numerical experiments that BI-APNNs yield substantially more accurate and robust results for inverse problems than the standard APNNs. Validated on both the semiconductor Boltzmann and the Boltzmann-Poisson systems, our work shows that the bi-fidelity formulation is a powerful enhancement for tackling multiscale kinetic equations, especially when dealing with inverse problems constrained by partial observation data.

Paper Structure

This paper contains 24 sections, 2 theorems, 64 equations, 12 figures, 11 tables.

Key Result

Lemma 1

Suppose the solution to the semiconductor Boltzmann equation satisfies $f \in C^1([0,T]) \cap C^1(\mathcal{D}) \cap C^1(\Omega)$. Let the activation function $\bar{\sigma}$ be any non-polynomial function in $C^1(\mathbb R)$. Then for any $\delta>0$, there exists a two-layer neural network such that […], where the domain $K$ denotes $[0,T]\times\mathcal{D}\times\Omega$.
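The estimate that the lemma asserts is not shown above. For orientation only, $C^1$ universal approximation results with these hypotheses are commonly stated in the form below. The network symbol $f^{\mathrm{NN}}$, the width $m$, the parameters $\beta_i, w_i, b_i$, and the choice of the sup norm are assumptions made for this sketch and are not taken from the paper.

```latex
f^{\mathrm{NN}}(t,x,v) = \sum_{i=1}^{m} \beta_i\, \bar{\sigma}\big(w_i \cdot (t,x,v) + b_i\big),
\qquad
\begin{aligned}
\| f - f^{\mathrm{NN}} \|_{L^\infty(K)} &< \delta, &
\| \partial_t (f - f^{\mathrm{NN}}) \|_{L^\infty(K)} &< \delta, \\
\| \nabla_x (f - f^{\mathrm{NN}}) \|_{L^\infty(K)} &< \delta, &
\| \nabla_v (f - f^{\mathrm{NN}}) \|_{L^\infty(K)} &< \delta.
\end{aligned}
```

Read this way, the lemma would say that a single hidden layer suffices to approximate both $f$ and its first derivatives uniformly on $K$. That is the property a residual-based loss needs, since the loss involves derivatives of the network output.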

Figures (12)

  • Figure 1: Illustration of APNNs.
  • Figure 2: Framework of BI-APNNs for the forward problem. Here $\sigma(x)$ denotes the activation function.
  • Figure 3: Framework of BI-APNNs for the inverse problem.
  • Figure 4: Problem I with different $\varepsilon$. Density $\rho$ for PINNs, APNNs, BI-APNNs, and reference solutions at $T=0.1$.
  • Figure 5: Problem I with $\varepsilon=1$ and $\varepsilon = 10^{-8}$. Comparison of training losses between BI-APNNs and APNNs.
  • ...and 7 more figures

Theorems & Definitions (3)

  • Lemma 1
  • Theorem 1: Convergence of BI-APNNs
  • proof