DeepRTE: Pre-trained Attention-based Neural Network for Radiative Transfer
Yekun Zhu, Min Tang, Zheng Ma
TL;DR
DeepRTE introduces a physics-informed, attention-based neural operator to solve the steady-state radiative transfer equation (RTE) by learning a Green's-function–based operator. It decomposes the operator into attenuation (along characteristics) and scattering components, implemented as Attenuation and Scattering modules, and pre-trains with delta-function inflow boundary conditions to enable zero-shot and transferable performance. Across three scattering regimes, DeepRTE achieves high accuracy with far fewer parameters than baselines and exhibits strong linearity and transfer capabilities, including zero-shot generalization to unseen boundary conditions. The approach delivers significant speedups over deterministic solvers while preserving physical interpretability, making it a scalable, mesh-free alternative for high-dimensional radiative transport problems.
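To make "attenuation along characteristics" concrete, the following is a minimal sketch of the classical quantity involved: the transmittance exp(-∫ μ_t ds) accumulated along a straight ray. It is not the DeepRTE Attenuation module. The function name `attenuation_along_ray`, the callable `mu_t`, and the trapezoidal quadrature are all assumptions made for illustration.

```python
import numpy as np


def attenuation_along_ray(mu_t, start, direction, length, n_steps=200):
    """Transmittance exp(-optical depth) along a straight characteristic.

    mu_t      : callable mapping a position (1-D array) to the total
                attenuation coefficient at that point (hypothetical interface)
    start     : starting position of the ray
    direction : ray direction (normalized internally)
    length    : distance travelled along the ray
    """
    start = np.asarray(start, dtype=float)
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)

    # Sample points r(s) = start + s * direction for s in [0, length].
    s = np.linspace(0.0, length, n_steps + 1)
    points = start[None, :] + s[:, None] * direction[None, :]
    values = np.array([mu_t(p) for p in points])

    # Composite trapezoidal rule for the optical depth, the integral of mu_t ds.
    ds = length / n_steps
    optical_depth = ds * (values.sum() - 0.5 * (values[0] + values[-1]))
    return float(np.exp(-optical_depth))


# Usage: in a homogeneous medium the result reduces to exp(-mu_t * length).
uniform = attenuation_along_ray(lambda p: 2.0, start=[0.0, 0.0],
                                direction=[1.0, 1.0], length=1.5)
```

In a purely absorbing medium this factor alone carries the inflow data along each ray. Scattering couples the directions, which is the part the TL;DR assigns to the Scattering module.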
Abstract
In this paper, we propose a novel neural network approach, termed DeepRTE, for solving the steady-state Radiative Transfer Equation (RTE). The RTE is an integro-differential equation that governs the propagation of radiation through a participating medium, with applications in neutron transport, atmospheric radiative transfer, heat transfer, and optical imaging. DeepRTE solves the steady-state RTE more efficiently than traditional methods and existing neural network approaches. This efficiency comes from embedding physical information, derived from the RTE itself, in a mathematically informed network architecture. DeepRTE also achieves high accuracy with significantly fewer parameters, largely owing to mechanisms such as multi-head attention. Furthermore, DeepRTE is a mesh-free neural operator framework with inherent zero-shot capability, obtained by building Green's function theory into its architecture and by pre-training on data generated with delta-function inflow boundary conditions. Comprehensive numerical experiments substantiate the efficacy of the proposed approach.
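For orientation, the steady-state RTE and the Green's function representation mentioned above can be written in a standard textbook form. The notation here is an assumption and may differ from the paper's: I is the radiation intensity, μ_t and μ_s are the total and scattering coefficients, p is the scattering phase function, and Γ_- is the inflow part of the boundary.

```latex
\begin{aligned}
\Omega \cdot \nabla I(r,\Omega) + \mu_t(r)\, I(r,\Omega)
  &= \mu_s(r) \int_{\mathbb{S}^{d-1}} p(\Omega,\Omega')\, I(r,\Omega')\, d\Omega',
  && (r,\Omega) \in D \times \mathbb{S}^{d-1}, \\
I(r,\Omega) &= I_-(r,\Omega),
  && (r,\Omega) \in \Gamma_- = \{\, r \in \partial D,\ n(r)\cdot\Omega < 0 \,\}, \\
I(r,\Omega) &= \int_{\Gamma_-} G(r,\Omega;\, r',\Omega')\, I_-(r',\Omega')\, dr'\, d\Omega'.
\end{aligned}
```

Here G(·; r', Ω') is the solution produced by a delta-function inflow concentrated at (r', Ω'), up to the choice of boundary measure. Because the equation is linear in I, a model that approximates G from delta-function inflow data can in principle handle an unseen inflow I_- by superposition, which is the sense in which the zero-shot capability follows from the Green's function construction.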
