DeepRTE: Pre-trained Attention-based Neural Network for Radiative Transfer

Yekun Zhu, Min Tang, Zheng Ma

TL;DR

DeepRTE introduces a physics-informed, attention-based neural operator to solve the steady-state radiative transfer equation (RTE) by learning a Green's-function–based operator. It decomposes the operator into attenuation (along characteristics) and scattering components, implemented as Attenuation and Scattering modules, and pretrains with delta-function inflow to enable zero-shot and transferable performance. Across three scattering regimes, DeepRTE achieves high accuracy with far fewer parameters than baselines and exhibits strong linearity and transfer capabilities, including zero-shot generalization to unseen boundary conditions. The approach delivers significant speedups over deterministic solvers while preserving physical interpretability, making it a scalable, mesh-free alternative for high-dimensional radiative transport problems.

Abstract

In this paper, we propose a novel neural network approach, termed DeepRTE, to address the steady-state Radiative Transfer Equation (RTE). The RTE is a differential-integral equation that governs the propagation of radiation through a participating medium, with applications spanning diverse domains such as neutron transport, atmospheric radiative transfer, heat transfer, and optical imaging. Our DeepRTE framework demonstrates superior computational efficiency for solving the steady-state RTE, surpassing traditional methods and existing neural network approaches. This efficiency is achieved by embedding physical information through derivation of the RTE and mathematically-informed network architecture. Concurrently, DeepRTE achieves high accuracy with significantly fewer parameters, largely due to its incorporation of mechanisms such as multi-head attention. Furthermore, DeepRTE is a mesh-free neural operator framework with inherent zero-shot capability. This is achieved by incorporating Green's function theory and pre-training with delta-function inflow boundary conditions into both its architecture design and training data construction. The efficacy of the proposed approach is substantiated through comprehensive numerical experiments.
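
The role of the delta-function inflow data can be made concrete with a short worked formula. This is a sketch under stated assumptions rather than the paper's exact statement: $G$ denotes the Green's function of the steady-state RTE, $I_-$ the inflow data on the inflow boundary $\Gamma_-$, and $\mathrm{d}\gamma'$ an assumed boundary measure on $\Gamma_-$ whose precise weighting is not given here. Because the equation is linear in $I$, the solution for arbitrary inflow data is a superposition of responses to point sources:

$$
I(\bm{r},\hat{\bm{\Omega}}) = \int_{\Gamma_-} G(\bm{r},\hat{\bm{\Omega}};\bm{r}',\hat{\bm{\Omega}}')\, I_-(\bm{r}',\hat{\bm{\Omega}}')\, \mathrm{d}\gamma'
$$

Choosing $I_-$ to be a delta function concentrated at a single boundary point $(\bm{r}_0',\hat{\bm{\Omega}}_0')$ collapses the integral to $G(\bm{r},\hat{\bm{\Omega}};\bm{r}_0',\hat{\bm{\Omega}}_0')$ itself. Training on such inflow data therefore amounts to learning $G$ directly, and a network that approximates $G$ can be applied to boundary conditions it has not seen by evaluating the same integral, which is consistent with the zero-shot behavior described above.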

Paper Structure

This paper contains 46 sections, 5 theorems, 134 equations, 12 figures, 14 tables, 7 algorithms.

Key Result

Theorem 2.1

See egger2014lp. Assume the following conditions hold: Then, for all $1\leq p \leq\infty$ and all admissible data $I_{-}$, the radiative transfer problem eq:rte-with-bc admits a unique solution $I$ that satisfies Moreover, if we further assume $\mu_t>0$ and for some $\nu>0$, then this allows one to consider also the case $\mu_t\to\infty$, which may be important for asymptotic considerations.

Figures (12)

  • Figure 1: DeepRTE network architecture. The diagram shows how the inputs are transformed as they pass through the network. The Attenuation Module processes phase coordinates and coefficients to approximate the operators $\mathcal{J}$ and $\mathcal{L}$, followed by the Scattering Module, which handles the angular quadrature evaluations for the operator iterations. Finally, the Green's function $G^{\text{NN}}$ is multiplied by the boundary conditions and integrated over $\Gamma_-$ to compute the radiation field.
  • Figure 2: Attenuation module architecture. The network takes as input the spatial coordinates ($\bm{r},\bm{r}'$), angular variables ($\hat{\bm{\Omega}},\hat{\bm{\Omega}}'$), and cross sections ($\mu_t,\mu_s$). The OpticalDepthNet submodule first computes the optical depth $\tau_-^{\text{NN}}$ along the characteristic line. These features are then processed by an MLP to produce the truncated Green's function representation $\bm{G}^{\text{NN}}\in\mathbb{R}^{d_{\text{model}}}$, which samples the radiation field at discrete points $\bm{r}-s_i\hat{\bm{\Omega}}$ (circles along characteristic line). This architecture efficiently captures both local scattering effects and non-local attenuation while maintaining computational tractability through dimension reduction.
  • Figure 3: Mask and relative position embedding. Solid dots represent active grid nodes within the $\delta$-neighborhood of the characteristic line. These nodes provide spatial support for optical depth interpolation, thereby avoiding full-domain computation.
  • Figure 4: Scattering module. Stacked residual blocks with physics-informed $\mathcal{S}$-approximation layers, employing tanh-activated operator transforms and adaptive layer normalization for stabilized learning.
  • Figure 5: Scattering block. Each block simulates a single application of the operator $\mathcal{S}$ and consists of a weighted summation approximating the integral, a linear projection, and an activation.
  • ...and 7 more figures
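
The optical depth mentioned in the Figure 2 caption can be illustrated with a small numerical sketch. The code below is not the paper's OpticalDepthNet, which is a learned submodule. It is a plain trapezoidal quadrature of the standard definition $\tau = \int_0^L \mu_t(\bm{r}-s\hat{\bm{\Omega}})\,\mathrm{d}s$, sampled at points $\bm{r}-s_i\hat{\bm{\Omega}}$ along the characteristic line. The function names (`optical_depth`, `attenuation`, `constant_medium`), the uniform spacing of the samples, and the example medium are all assumptions made for illustration.

```python
import math


def optical_depth(mu_t, r, omega, path_length, n_samples=64):
    """Trapezoidal estimate of tau = integral_0^L mu_t(r - s * omega) ds.

    mu_t        -- callable mapping a point (tuple of floats) to the total cross section
    r           -- end point of the characteristic line (tuple of floats)
    omega       -- unit direction of travel (tuple of floats)
    path_length -- distance L traced backwards from r along -omega
    """
    if n_samples < 2:
        raise ValueError("need at least two sample points")
    ds = path_length / (n_samples - 1)
    total = 0.0
    for i in range(n_samples):
        s = i * ds
        # Sample point r - s_i * omega, stepping backwards along the characteristic.
        point = tuple(rk - s * ok for rk, ok in zip(r, omega))
        # Trapezoidal rule: the two end points carry half weight.
        weight = 0.5 if i in (0, n_samples - 1) else 1.0
        total += weight * mu_t(point)
    return total * ds


def attenuation(mu_t, r, omega, path_length, n_samples=64):
    """Factor exp(-tau) by which unscattered radiation is reduced along the path."""
    return math.exp(-optical_depth(mu_t, r, omega, path_length, n_samples))


def constant_medium(point):
    """Hypothetical homogeneous medium with mu_t = 2 everywhere."""
    return 2.0


# Path of length 0.5 through the homogeneous medium: tau is approximately 2 * 0.5.
tau = optical_depth(constant_medium, (0.5, 0.5), (1.0, 0.0), 0.5)
print(tau, attenuation(constant_medium, (0.5, 0.5), (1.0, 0.0), 0.5))
```

For a homogeneous medium the quadrature reduces to $\mu_t L$, which gives a quick sanity check. In a heterogeneous medium only the values of $\mu_t$ near the characteristic line enter the sum, which is the locality that the mask in Figure 3 exploits.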

Theorems & Definitions (12)

  • Theorem 2.1
  • Theorem 2.2
  • Theorem 2.3
  • Remark 2.1
  • Remark 2.2
  • Remark 3.1
  • Remark 3.2
  • Remark 3.3
  • Theorem 4.1
  • Theorem 4.2
  • ...and 2 more