Enforcing the Principle of Locality for Physical Simulations with Neural Operators

Jiangce Chen; Wenzhuo Xu; Zeda Xu; Noelia Grande Gutiérrez; Sneha Prabha Narra; Christopher McComb

TL;DR

This work identifies a fundamental incompatibility between deep neural operators and strict locality in time-dependent PDEs: increasing network depth widens the region of input that each local prediction depends on, which can degrade learning when data are limited. It introduces DDELD (data decomposition enforcing local-dependency), which enforces local-dependency by operating on fixed-size windows and integrating predictions across multiple domain decompositions, with linear-time complexity. Across mass transport, Burgers' equation, isotropic turbulence, and additive manufacturing (AM) heat-transfer simulations, DDELD substantially accelerates training convergence and improves geometric generalization, particularly for operators with broader spatial reach. The method promises scalable, parallelizable neural-operator solutions for large-scale engineering simulations and offers avenues for extension to unstructured data and multi-scale problems.

Abstract

Time-dependent partial differential equations (PDEs) for classical physical systems are derived from the conservation of mass, momentum, and energy, and are ubiquitous in scientific and engineering applications. According to the principle of locality in physics, these PDEs are strictly local-dependent: the evolution at a point is influenced only by a neighborhood around it, whose size is the timestep length multiplied by the speed at which characteristic information travels in the system. However, deep learning architectures cannot strictly enforce local-dependency, because the scope of information used for each local prediction inevitably grows as the number of layers increases. Under limited training data, the extra irrelevant information results in sluggish convergence and compromised generalizability. This paper addresses the problem by proposing a data decomposition method that strictly limits the scope of information available to neural operators when making local predictions, called data decomposition enforcing local-dependency (DDELD). Numerical experiments over multiple physical phenomena show that DDELD significantly accelerates training convergence and reduces the test errors of benchmark models on large-scale engineering simulations.
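The neighborhood size described in the abstract (timestep length times characteristic speed) can be turned into a window size on a uniform grid. The following is a minimal sketch under assumptions that are not stated in the source: a uniform grid spacing `dx`, a single scalar characteristic speed, and rounding up to whole cells. The function names are hypothetical.

```python
import math

def dependency_radius_cells(speed, dt, dx):
    """Cells on each side of a point that can influence it within one timestep.

    Assumes a uniform grid with spacing dx and one scalar characteristic speed.
    """
    return math.ceil(speed * dt / dx)

def window_size(speed, dt, dx):
    """Odd window width (in cells) that covers the dependency neighborhood."""
    return 2 * dependency_radius_cells(speed, dt, dx) + 1

# Example: information travels 2.0 * 0.5 = 1.0 length units per step,
# which spans 4 cells of width 0.25 on each side of the point.
print(window_size(speed=2.0, dt=0.5, dx=0.25))
```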

Paper Structure (33 sections, 4 theorems, 32 equations, 19 figures, 1 table, 3 algorithms)

Introduction
State-of-the-art methods
Unsolved problem
Contributions
Remarks
Background of Neural Operators
Neural operator formulation
Learning framework
Discretization
Incompatibility between Deep Learning and Local-dependency
Local-dependency
The more the layers, the weaker the local-dependency
Methodology
Method formulation
Domain decomposition
...and 18 more sections

Key Result

Theorem 1

Let $G_{\theta}: \mathcal{A}\times \mathcal{U} \rightarrow \mathcal{U}$ be a neural operator consisting of $k$ layers of local-dependent convolution, as defined in Equation (equ:local_convolution), where the interval of the convolution is $U(x, \delta)$. While the local-dependent region of each convolu
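The theorem statement is truncated here, so the sketch below does not restate it. It only illustrates the general effect described in the abstract: when layers of local convolution are stacked, the input region that can influence one output point widens with depth. The setup is an assumption made for illustration: a 1D grid, purely linear layers with a positive averaging kernel of radius `radius` cells, and hypothetical function names.

```python
import numpy as np

def stacked_local_conv(u, radius, layers):
    """Apply `layers` local averaging convolutions, each with the given radius."""
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    for _ in range(layers):
        u = np.convolve(u, kernel, mode="same")
    return u

def influence_radius(n, radius, layers):
    """How far (in cells) a single-cell perturbation spreads after all layers."""
    u = np.zeros(n)
    center = n // 2
    u[center] = 1.0                       # perturb one input cell
    out = stacked_local_conv(u, radius, layers)
    touched = np.nonzero(out)[0]          # output cells that respond to it
    return int(max(center - touched.min(), touched.max() - center))

for k in (1, 2, 3, 4):
    print(k, influence_radius(n=101, radius=2, layers=k))
```

In this linear setting the spread is `layers * radius` cells, so the dependency scope is tied to depth rather than to the physics of the system.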

Figures (19)

Figure 1: The target problem. (a) A deep learning architecture inevitably expands the scope of input data used for the prediction at one position as the number of layers increases, which is incompatible with the local-dependency assumption for a classical physical system. (b) The DDELD method proposed in this paper ensures that the scope of the input data stays constant regardless of the number of layers, which decouples the expressiveness of a neural network from its local-dependency.
Figure 2: Overview of the method. (a) To predict the physical property at the black position, its neighbors (colored in grey) contain sufficient information. The prediction is colored in orange. (b) One decomposition of the domain can be used to make predictions over part of the domain. (c) Multiple decompositions and a prediction integration algorithm are needed to make the prediction over the whole domain.
Figure 3: An example of expanding the domain in two steps. (a) Given window size $(3,3)$, a 2D domain of size $(7,7)$ needs to be expanded to a multiple of the window size. (b) In the first step, the domain is expanded to $(9,9)$ by padding zeros at the end of each dimension. (c) In the second step, the domain is expanded to $(10,10)$, to be compatible with the prediction integration algorithm, by padding zeros at the beginning and the end of each dimension.
Figure 4: Illustration of the domain decomposition and window patching for one partition. (a) A data batch of the global domain. In this example, the data are in 2D space, with $X^0$ denoting the batch dimension and $X^1$ and $X^2$ denoting the two domain dimensions. The batch size is 4 and the domain is decomposed into $3\times 3$ blocks. (b) The batch is split into three parts along the $X^2$ dimension. (c) The parts are stacked along the $X^0$ dimension to make a new batch of size 12 with $1\times 3$ blocks. (d) The batch is split into three parts along the $X^1$ dimension. (e) The parts are stacked along the $X^0$ dimension. (f) The original data is now decomposed into $3\times 3$ blocks, which are stacked to make a new batch of size 36. (g) The ML model predicts the physical properties at the centers of the windows. (h) By reversing the decomposition process (b) to (e), the data is restored to its original shape, with predictions made at the center of each window.
Figure 5: Illustration of the data integration algorithm. (a) A batch of 2D data viewed from the top, where $X^1$ and $X^2$ are the two domain dimensions and the batch dimension $X^0$ is not shown. The domain, represented by the grid, is expanded by zero padding, shown as the blank space near its boundary. (b) Multiple partitions are made over the expanded domain. The predictions over the partitions are made independently, and only the predictions at the centers of the blocks (colored in orange) are preserved for the prediction integration. (c) The predictions are integrated over the whole domain.
...and 14 more figures
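The captions of Figures 3 to 5 describe a pipeline: pad the domain, cut it into fixed-size windows stacked along the batch dimension, predict only the window centers, then repeat over shifted partitions and merge the results. The following NumPy sketch shows one way such a pipeline could look for a single 2D field. It is not the authors' implementation: the padding scheme is a simplification that differs from the $(7,7) \to (9,9) \to (10,10)$ example in Figure 3, the batch and time dimensions are omitted, and `ddeld_predict` and `model` are hypothetical names. Here `model` is any callable mapping windows of shape `(n, w, w)` to `n` center predictions.

```python
import numpy as np

def ddeld_predict(field, w, model):
    """Predict every cell of a 2D field from its own w-by-w window only."""
    if w % 2 != 1:
        raise ValueError("window size must be odd so each window has a center")
    r = w // 2
    H, W = field.shape
    nb_h, nb_w = -(-H // w), -(-W // w)      # ceil(H / w), ceil(W / w)

    # Zero-pad so that every shifted partition fits inside the array.
    padded = np.zeros(((nb_h + 1) * w - 1, (nb_w + 1) * w - 1), dtype=field.dtype)
    padded[r:r + H, r:r + W] = field

    out = np.zeros(field.shape, dtype=float)
    for si in range(w):
        for sj in range(w):
            # One partition: cut into w-by-w blocks and stack them as a batch.
            sub = padded[si:si + nb_h * w, sj:sj + nb_w * w]
            windows = (sub.reshape(nb_h, w, nb_w, w)
                          .transpose(0, 2, 1, 3)
                          .reshape(-1, w, w))
            preds = np.asarray(model(windows)).reshape(nb_h, nb_w)

            # A window starting at padded index s is centered on original index s.
            rows = si + w * np.arange(nb_h)
            cols = sj + w * np.arange(nb_w)
            keep_r, keep_c = rows < H, cols < W
            out[np.ix_(rows[keep_r], cols[keep_c])] = preds[keep_r][:, keep_c]
    return out

# Sanity check with a stand-in "model" that just returns each window's center.
field = np.arange(49, dtype=float).reshape(7, 7)
identity = lambda windows: windows[:, 1, 1]
assert np.array_equal(ddeld_predict(field, 3, identity), field)
```

Each of the `w * w` shifted partitions fills in a disjoint subset of cells, so every cell is predicted exactly once, and the total number of windows evaluated stays proportional to the number of cells. Because the model only ever sees one window at a time, its input scope is fixed by `w` no matter how many layers it has.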

Theorems & Definitions (10)

Definition 1
Definition 2
Theorem 1
Theorem 2
Lemma 1
proof
proof
Lemma 2
Definition 3
proof
