Separable Operator Networks

Xinling Yu, Sean Hooten, Ziyue Liu, Yequan Zhao, Marco Fiorentino, Thomas Van Vaerenbergh, Zheng Zhang

TL;DR

Separable Operator Networks (SepONet) address the data- and compute-intensive training of physics-informed operator learning by factorizing the basis functions along coordinate axes using independent trunk nets. The approach yields a universal approximation guarantee for nonlinear continuous operators and achieves substantial speedups and memory savings over PI-DeepONet, especially as problem complexity, dimensionality, or scale increases. The framework leverages forward-mode automatic differentiation to efficiently compute spatiotemporal derivatives and supports extreme-scale learning for various time-dependent PDEs, with open-source code available for replication. This work thus enables more scalable, accurate operator learning across infinite-dimensional function spaces and paves the way for applying such techniques to complex systems like Navier–Stokes.

Abstract

Operator learning has become a powerful tool in machine learning for modeling complex physical systems governed by partial differential equations (PDEs). Although Deep Operator Networks (DeepONet) show promise, they require extensive data acquisition. Physics-informed DeepONets (PI-DeepONet) mitigate data scarcity but suffer from inefficient training processes. We introduce Separable Operator Networks (SepONet), a novel framework that significantly enhances the efficiency of physics-informed operator learning. SepONet uses independent trunk networks to learn basis functions separately for different coordinate axes, enabling faster and more memory-efficient training via forward-mode automatic differentiation. We provide a universal approximation theorem for SepONet proving the existence of a separable approximation to any nonlinear continuous operator. Then, we comprehensively benchmark its representational capacity and computational performance against PI-DeepONet. Our results demonstrate SepONet's superior performance across various nonlinear and inseparable PDEs, with SepONet's advantages increasing with problem complexity, dimension, and scale. For 1D time-dependent PDEs, SepONet achieves up to 112x faster training and 82x reduction in GPU memory usage compared to PI-DeepONet, while maintaining comparable accuracy. For the 2D time-dependent nonlinear diffusion equation, SepONet efficiently handles the complexity, achieving a 6.44% mean relative $\ell_{2}$ test error, while PI-DeepONet fails due to memory constraints. This work paves the way for extreme-scale learning of continuous mappings between infinite-dimensional function spaces. Open-source code is available at https://github.com/HewlettPackard/separable-operator-networks.
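
To make "basis functions learned separately for different coordinate axes" concrete, the separable form can be written schematically as below. The symbols $G_\theta$, $\beta_k$, $\tau_{n,k}$, $r$ and $N$ are notation assumed here for illustration, not taken from the text above:

$$G_\theta(u)(y_1, \dots, y_d) = \sum_{k=1}^{r} \beta_k(u) \prod_{n=1}^{d} \tau_{n,k}(y_n)$$

Here $\beta_k(u)$ is the $k$-th branch-network coefficient for input function $u$, and $\tau_{n,k}$ is the $k$-th output of the trunk network assigned to coordinate axis $n$. On a grid with $N$ points per axis, this form needs only $d \cdot N$ trunk-network evaluations to cover all $N^d$ grid points. For example, with $N = 128$ that is $256$ evaluations instead of $16{,}384$ when $d = 2$, and $384$ instead of $2{,}097{,}152$ when $d = 3$.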

Paper Structure (48 sections, 9 theorems, 61 equations, 17 figures, 7 tables)

This paper contains 48 sections, 9 theorems, 61 equations, 17 figures, 7 tables.

Key Result

Theorem 1

Suppose that $\sigma$ is a Tauber-Wiener function, $g$ is a sinusoidal function, $\mathcal{X}$ is a Banach space, and $K \subseteq \mathcal{X}$, $K_1 \subseteq \mathbb{R}^{d_{1}}$ and $K_2 \subseteq \mathbb{R}^{d_{2}}$ are three compact sets in $\mathcal{X}$, $\mathbb{R}^{d_{1}}$ and $\mathbb{R}^{d_{2}}$, respectively. [...] holds for all $u \in \mathcal{U}$, $y = (y_{1}, y_{2}) \in K_1 \times K_2$.

Figures (17)

  • Figure 1: Separable operator network (SepONet) architecture for 2D problem instance. A coordinate grid of collocation points $(x^{(i)}, y^{(j)})$ can be evaluated efficiently by separating the coordinate axes, feeding them through independent trunk networks, and combining the outputs by outer product to obtain multiple basis function maps. Meanwhile, the branch network processes input functions and outputs coefficients, which are then used to scale and combine the trunk network basis functions by product and sum. Spatiotemporal derivatives of the output predictions are obtained efficiently by forward-mode automatic differentiation due to the independence of trunk networks along each coordinate axis.
  • Figure 2: Performance comparison of PI-DeepONet and SepONet with varying number of training points ($N_c$) and fixed number of input functions ($N_{f} = 100$). Results show test accuracy, GPU memory usage, and training time for four PDEs. As $N_c$ increases, both models demonstrate improved accuracy, but PI-DeepONet exhibits significant increases in training time and memory usage, while SepONet maintains better computational efficiency.
  • Figure 3: Performance comparison of PI-DeepONet and SepONet with increasing number of input functions ($N_f$) and fixed number of training points ($N_{c} = 128^{d}$, where $d$ is the problem dimension). Both models show improved accuracy with increasing $N_f$, but PI-DeepONet's computational resources scale poorly compared to SepONet's more efficient scaling. Note: PI-DeepONet results for the (2+1)-dimensional diffusion equation are unavailable due to memory constraints.
  • Figure 4: Performance comparison of PI-DeepONet and SepONet with TanH trunk network activation functions, varying number of training points ($N_c$) and fixed number of input functions ($N_{f} = 100$). Results show test accuracy, GPU memory usage, and training time for four PDEs.
  • Figure 5: Performance comparison of PI-DeepONet and SepONet with TanH trunk network activation functions, increasing number of input functions ($N_f$) and fixed number of training points ($N_{c} = 128^{d}$, where $d$ is the problem dimension). Note: PI-DeepONet results for the (2+1)-dimensional diffusion equation are unavailable due to memory constraints.
  • ...and 12 more figures
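
The mechanism described in the Figure 1 caption can be sketched in JAX as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`init_mlp`, `mlp`, `seponet`), the layer sizes, the rank `r`, and the choice of a (1+1)-dimensional problem with coordinates `t` and `x` are all hypothetical. It shows the branch coefficients scaling an outer product of per-axis trunk outputs, and a single forward-mode `jax.jvp` call producing the time derivative at every grid point.

```python
import jax
import jax.numpy as jnp


def init_mlp(key, sizes):
    """Random weights for a small fully connected network (illustrative sizes)."""
    keys = jax.random.split(key, len(sizes) - 1)
    return [
        (jax.random.normal(k, (n_in, n_out)) / jnp.sqrt(n_in), jnp.zeros(n_out))
        for k, n_in, n_out in zip(keys, sizes[:-1], sizes[1:])
    ]


def mlp(params, z):
    """tanh MLP applied along the last axis of z."""
    for w, b in params[:-1]:
        z = jnp.tanh(z @ w + b)
    w, b = params[-1]
    return z @ w + b


def seponet(params, u, t, x):
    """Predict on the full grid t x x for one input function.

    u: (m,) sensor values, t: (N_t,), x: (N_x,). Returns (N_t, N_x).
    """
    beta = mlp(params["branch"], u)              # (r,)     branch coefficients
    tau_t = mlp(params["trunk_t"], t[:, None])   # (N_t, r) basis along t
    tau_x = mlp(params["trunk_x"], x[:, None])   # (N_x, r) basis along x
    # Outer product over the two axes, scaled by beta and summed over rank k.
    return jnp.einsum("k,ik,jk->ij", beta, tau_t, tau_x)


m, r = 16, 8  # assumed number of sensors and number of basis functions
k_b, k_t, k_x = jax.random.split(jax.random.PRNGKey(0), 3)
params = {
    "branch": init_mlp(k_b, [m, 32, r]),
    "trunk_t": init_mlp(k_t, [1, 32, r]),
    "trunk_x": init_mlp(k_x, [1, 32, r]),
}
u = jnp.sin(jnp.linspace(0.0, jnp.pi, m))
t = jnp.linspace(0.0, 1.0, 5)
x = jnp.linspace(0.0, 1.0, 7)

s = seponet(params, u, t, x)  # (5, 7) predictions on the grid

# Row i of the output depends only on t[i], so one JVP with an all-ones
# tangent yields ds/dt at every grid point in a single forward pass.
s_val, ds_dt = jax.jvp(
    lambda tt: seponet(params, u, tt, x), (t,), (jnp.ones_like(t),)
)
```

The trunk networks here are evaluated at 5 + 7 coordinate values rather than at all 35 grid points, which is the source of the savings the caption refers to. The same JVP pattern applies to `x` by differentiating with respect to that argument instead.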

Theorems & Definitions (25)

  • Theorem 1: Universal Approximation Theorem for Separable Operator Networks
  • proof
  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4
  • Remark 5
  • Definition 1: Tauber-Wiener (TW)
  • Remark 1: Density in $C[a, b]$
  • Definition 2: Compact Set
  • ...and 15 more