
Stochastic Dimension-Free Zeroth-Order Estimator for High-Dimensional and High-Order PINNs

Zhangyong Liang, Ji Zhang

Abstract

Physics-Informed Neural Networks (PINNs) for high-dimensional and high-order partial differential equations (PDEs) are primarily constrained by the $\mathcal{O}(d^k)$ spatial derivative complexity and the $\mathcal{O}(P)$ memory overhead of backpropagation (BP). While randomized spatial estimators successfully reduce the spatial complexity to $\mathcal{O}(1)$, their reliance on first-order optimization still leads to prohibitive memory consumption at scale. Zeroth-order (ZO) optimization offers a BP-free alternative; however, naively combining randomized spatial operators with ZO perturbations triggers a variance explosion of $\mathcal{O}(1/\varepsilon^2)$, leading to numerical divergence. To address these challenges, we propose the Stochastic Dimension-free Zeroth-order Estimator (SDZE), a unified framework that achieves dimension-independent complexity in both space and memory. Specifically, SDZE leverages Common Random Numbers Synchronization (CRNS) to algebraically cancel the $\mathcal{O}(1/\varepsilon^2)$ variance by locking spatial random seeds across perturbations. Furthermore, an implicit matrix-free subspace projection is introduced to reduce parameter exploration variance from $\mathcal{O}(P)$ to $\mathcal{O}(r)$ while maintaining an $\mathcal{O}(1)$ optimizer memory footprint. Empirical results demonstrate that SDZE enables the training of 10-million-dimensional PINNs on a single NVIDIA A100 GPU, delivering significant improvements in speed and memory efficiency over state-of-the-art baselines.
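
The variance claim in the abstract can be illustrated on a toy problem. The sketch below is not the paper's implementation: `noisy_loss` is a hypothetical stand-in for a randomized spatial estimator (a Gaussian quadratic form whose expectation is $\sum_i \theta_i^2$), and `zo_directional_derivative` is a generic two-point finite difference with perturbation size `eps`. It compares drawing independent spatial seeds for the two perturbed evaluations against reusing one seed for both, which is the common-random-numbers idea the abstract describes.

```python
import numpy as np


def noisy_loss(theta, seed):
    """Hypothetical randomized estimator: E[sum((theta * v)**2)] = sum(theta**2)."""
    rng = np.random.default_rng(seed)      # the "spatial" randomness is fixed by the seed
    v = rng.standard_normal(theta.shape)
    return float(np.sum((theta * v) ** 2))


def zo_directional_derivative(theta, u, eps, seed_plus, seed_minus):
    """Two-point zeroth-order estimate of the derivative of the loss along u."""
    f_plus = noisy_loss(theta + eps * u, seed_plus)
    f_minus = noisy_loss(theta - eps * u, seed_minus)
    return (f_plus - f_minus) / (2.0 * eps)


def compare_variances(eps=1e-3, trials=2000, dim=16):
    """Return (variance with independent seeds, variance with a shared seed)."""
    master = np.random.default_rng(0)
    theta = master.standard_normal(dim)
    u = master.standard_normal(dim)
    naive, synced = [], []
    for t in range(trials):
        s1, s2 = 2 * t + 1, 2 * t + 2
        # Independent seeds: the estimator noise does not cancel and is divided by eps.
        naive.append(zo_directional_derivative(theta, u, eps, s1, s2))
        # Shared seed: the noise is common to both evaluations and cancels in the difference.
        synced.append(zo_directional_derivative(theta, u, eps, s1, s1))
    return float(np.var(naive)), float(np.var(synced))


if __name__ == "__main__":
    naive_var, synced_var = compare_variances()
    print(f"independent seeds: {naive_var:.3e}")
    print(f"shared seed:       {synced_var:.3e}")
```

In this toy setting the shared-seed difference reduces exactly to $\sum_i v_i^2 \cdot 2\theta_i u_i$, which does not depend on `eps`, whereas the independent-seed variance scales like $1/\varepsilon^2$. Whether the same cancellation is exact for the spatial operators used in the paper is a property established there, not by this example.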

Paper Structure (44 sections, 8 theorems, 76 equations, 1 figure, 4 tables)

Key Result

Theorem 1

The latent exact spatial gradients $g_I(\boldsymbol{\theta})$ and $g_{I,J}(\boldsymbol{\theta})$, parameterized by index sets $I, J$, are unbiased estimators of the full-batch spatial gradient $g(\boldsymbol{\theta})$ using all PDE terms, i.e., the expected values of these latent targets match that o

Figures (1)

  • Figure 1: Scalability bottlenecks of existing first-order high-dimensional PDE solvers (STDE, SDGD, HTE, FOBAD) as problem dimension $d$ grows from $5\times10^2$ to $10^6$. (a) GPU peak memory (MB) grows rapidly with dimension, leading to catastrophic OOM failures at extreme scales. (b) Wall-clock time (s) likewise increases sharply, revealing severe computational overhead.

Theorems & Definitions (21)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Definition 1
  • Remark
  • Lemma
  • proof
  • Definition 2
  • Definition 3
  • ...and 11 more