Stochastic Dimension-Free Zeroth-Order Estimator for High-Dimensional and High-Order PINNs

Zhangyong Liang; Ji Zhang

Abstract

Physics-Informed Neural Networks (PINNs) for high-dimensional and high-order partial differential equations (PDEs) are primarily constrained by the $\mathcal{O}(d^k)$ spatial derivative complexity and the $\mathcal{O}(P)$ memory overhead of backpropagation (BP). While randomized spatial estimators successfully reduce the spatial complexity to $\mathcal{O}(1)$, their reliance on first-order optimization still leads to prohibitive memory consumption at scale. Zeroth-order (ZO) optimization offers a BP-free alternative; however, naively combining randomized spatial operators with ZO perturbations triggers a variance explosion of $\mathcal{O}(1/\varepsilon^2)$, leading to numerical divergence. To address these challenges, we propose the \textbf{S}tochastic \textbf{D}imension-free \textbf{Z}eroth-order \textbf{E}stimator (\textbf{SDZE}), a unified framework that achieves dimension-independent complexity in both space and memory. Specifically, SDZE leverages \emph{Common Random Numbers Synchronization (CRNS)} to algebraically cancel the $\mathcal{O}(1/\varepsilon^2)$ variance by locking spatial random seeds across perturbations. Furthermore, an \emph{implicit matrix-free subspace projection} is introduced to reduce parameter exploration variance from $\mathcal{O}(P)$ to $\mathcal{O}(r)$ while maintaining an $\mathcal{O}(1)$ optimizer memory footprint. Empirical results demonstrate that SDZE enables the training of 10-million-dimensional PINNs on a single NVIDIA A100 GPU, delivering significant improvements in speed and memory efficiency over state-of-the-art baselines.
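The variance cancellation attributed to CRNS can be illustrated on a toy problem. The sketch below is not the authors' implementation: `noisy_loss` is a hypothetical stand-in for a randomized spatial estimator (a random-probe estimate of $\|\boldsymbol{\theta}\|^2$), and the seed arguments are assumptions introduced for illustration. It compares a two-point ZO directional estimate computed with independent spatial seeds against one computed with a shared seed.

```python
import numpy as np

def noisy_loss(theta, seed, n_probe=8):
    """Toy stochastic loss (hypothetical): random-probe estimate of ||theta||^2."""
    rng = np.random.default_rng(seed)          # the seed fixes the "spatial" randomness
    v = rng.standard_normal((n_probe, theta.shape[0]))
    return np.mean((v @ theta) ** 2)           # E[(v . theta)^2] = ||theta||^2

def zo_directional(theta, u, eps, seed_plus, seed_minus):
    """Two-point zeroth-order estimate of the directional derivative along u."""
    diff = noisy_loss(theta + eps * u, seed_plus) - noisy_loss(theta - eps * u, seed_minus)
    return diff / (2.0 * eps)

def compare(eps=1e-3, trials=200, d=16):
    """Return (variance with independent seeds, variance with a shared seed)."""
    rng = np.random.default_rng(0)
    theta = rng.standard_normal(d)
    u = rng.standard_normal(d)
    indep, crn = [], []
    for t in range(trials):
        # Independent seeds: the spatial noise of the two evaluations does not cancel.
        indep.append(zo_directional(theta, u, eps, 2 * t + 1, 2 * t + 2))
        # Shared seed: both evaluations see the same probes, so the noise cancels.
        crn.append(zo_directional(theta, u, eps, 2 * t + 1, 2 * t + 1))
    return np.var(indep), np.var(crn)

if __name__ == "__main__":
    var_indep, var_crn = compare()
    print(f"independent seeds: {var_indep:.3e}   shared seed: {var_crn:.3e}")
```

In this toy setting the independent-seed estimate divides the sampling noise of the loss by $2\varepsilon$, so its variance scales as $1/\varepsilon^2$, whereas with a shared seed the noise common to both evaluations cancels in the difference and the variance no longer depends on $\varepsilon$.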

Paper Structure (44 sections, 8 theorems, 76 equations, 1 figure, 4 tables)

This paper contains 44 sections, 8 theorems, 76 equations, 1 figure, 4 tables.

Introduction
Related works
High-order and forward mode AD
Randomized Gradient Estimation
Zeroth-order Gradient Estimation
Preliminaries
Notations
First-order gradient estimation
Stochastic dimension gradient descent
Zeroth-order gradient estimation
Method
Phase I: Rigorous Formulation of Spatial Operator Amortization
Phase II: Kronecker-Isomorphic Subspace Projection and Lazy Updates
Phase III: The Doubly Stochastic Deadlock and CRNS Mechanism
Phase IV: Implicit Associative Forward Pass for Deep Networks
...and 29 more sections

Key Result

Theorem 1

The latent exact spatial gradients $g_I(\boldsymbol{\theta})$ and $g_{I,J}(\boldsymbol{\theta})$, parameterized by index sets $I, J$, are unbiased estimators of the full-batch spatial gradient $g(\boldsymbol{\theta})$ using all PDE terms, i.e., the expected values of these latent targets match that o
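The unbiasedness property can be checked numerically on a toy sum. The following is a minimal sketch under stated assumptions, not the paper's estimator: the array `terms` stands in for per-term gradient contributions, the index set $I$ is drawn uniformly without replacement, and the $d/|I|$ rescaling is an assumption borrowed from common subsampling estimators. The names `subsampled_estimate` and `mean_estimate` are hypothetical.

```python
import numpy as np

def subsampled_estimate(terms, batch, rng):
    """Estimate sum(terms) from a uniformly sampled index set I of size `batch`."""
    d = terms.shape[0]
    idx = rng.choice(d, size=batch, replace=False)   # random index set I
    return (d / batch) * terms[idx].sum()            # rescale so the expectation is the full sum

def mean_estimate(terms, batch, trials, seed=0):
    """Average many independent subsampled estimates."""
    rng = np.random.default_rng(seed)
    return np.mean([subsampled_estimate(terms, batch, rng) for _ in range(trials)])

if __name__ == "__main__":
    terms = np.arange(1.0, 101.0)                    # toy per-term contributions
    print("full sum:          ", terms.sum())
    print("mean of estimates: ", mean_estimate(terms, batch=10, trials=5000))
```

Each index appears in $I$ with probability $|I|/d$, so the rescaled partial sum has the full sum as its expectation; the average over many draws should therefore approach the full sum.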

Figures (1)

Figure 1: Scalability bottlenecks of existing first-order high-dimensional PDE solvers (STDE, SDGD, HTE, FOBAD) as problem dimension $d$ grows from $5\times10^2$ to $10^6$. (a) GPU peak memory (MB) grows rapidly with dimension, leading to catastrophic OOM failures at extreme scales. (b) Wall-clock time (s) likewise increases sharply, revealing severe computational overhead.

Theorems & Definitions (21)

Theorem 1
proof
Theorem 2
proof
Definition 1
Remark
Lemma
proof
Definition 2
Definition 3
...and 11 more
