Zero Coordinate Shift: Whetted Automatic Differentiation for Physics-informed Operator Learning

Kuangdai Leng, Mallikarjun Shankar, Jeyan Thiyagalingam

TL;DR

This paper presents Zero Coordinate Shift (ZCS), a novel and lightweight algorithm for conducting AD in physics-informed operator learning. ZCS avoids duplicating the computational graph along the dimension of functions (physical parameters), which reduces GPU memory consumption and training wall time by an order of magnitude.

Abstract

Automatic differentiation (AD) is a critical step in physics-informed machine learning, required for computing the high-order derivatives of network output w.r.t. coordinates of collocation points. In this paper, we present a novel and lightweight algorithm to conduct AD for physics-informed operator learning, which we call the trick of Zero Coordinate Shift (ZCS). Instead of treating all sampled coordinates as leaf variables, ZCS introduces only one scalar-valued leaf variable for each spatial or temporal dimension, simplifying the wanted derivatives from "many-roots-many-leaves" to "one-root-many-leaves", whereby reverse-mode AD becomes directly utilisable. This leads to an outstanding performance leap by avoiding the duplication of the computational graph along the dimension of functions (physical parameters). ZCS is easy to implement with current deep learning libraries; our own implementation is achieved by extending the DeepXDE package. We carry out a comprehensive benchmark analysis and several case studies, training physics-informed DeepONets to solve partial differential equations (PDEs) without data. The results show that ZCS consistently reduces GPU memory consumption and wall time for training by an order of magnitude, and that this reduction factor scales with the number of functions. As a low-level optimisation technique, ZCS imposes no restrictions on data, physics (PDE) or network architecture and does not compromise training results in any respect.
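To make the "one-root-many-leaves" idea concrete, here is a minimal sketch in plain Python. It is not the authors' DeepXDE implementation. The `Var` class is a toy reverse-mode AD written only for this illustration, `toy_operator` is a hypothetical stand-in for a network output $u_{ij}=f(p_i, x_j)$, and the dummy weights `a` and the scalar root `omega` are names chosen here. The sketch assumes the derivatives are recovered by differentiating the single root `omega` first w.r.t. the zero-valued leaf `z` and then w.r.t. the weights `a`.

```python
import math

class Var:
    """Scalar node in a toy reverse-mode AD graph (illustration only)."""
    def __init__(self, value, parents=()):
        self.value = float(value)
        self.parents = parents  # pairs: (input node, adjoint -> contribution)

    def __add__(self, other):
        other = lift(other)
        return Var(self.value + other.value,
                   ((self, lambda g: g), (other, lambda g: g)))

    def __mul__(self, other):
        other = lift(other)
        return Var(self.value * other.value,
                   ((self, lambda g: g * other), (other, lambda g: g * self)))

    __radd__, __rmul__ = __add__, __mul__

def lift(v):
    return v if isinstance(v, Var) else Var(v)

def sin(v):
    return Var(math.sin(v.value), ((v, lambda g: g * cos(v)),))

def cos(v):
    return Var(math.cos(v.value), ((v, lambda g: g * (-1.0) * sin(v)),))

def grad(root, leaves):
    """Reverse-mode gradient of ONE root w.r.t. many leaves.

    Adjoints are Vars themselves, so the result can be differentiated again.
    """
    order, seen = [], set()
    def visit(node):  # depth-first post-order: inputs come before consumers
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)
    visit(root)
    adjoint = {id(root): Var(1.0)}
    for node in reversed(order):  # consumers are processed before their inputs
        for parent, pullback in node.parents:
            term = pullback(adjoint[id(node)])
            key = id(parent)
            adjoint[key] = adjoint[key] + term if key in adjoint else term
    return [adjoint.get(id(leaf), Var(0.0)) for leaf in leaves]

def toy_operator(p_i, x_j):
    # Hypothetical stand-in for a network output u_ij = f(p_i, x_j).
    return sin(p_i * x_j)

params = [1.0, 1.5, 2.0]       # M = 3 "functions" (physical parameters)
points = [0.0, 0.3, 0.6, 0.9]  # N = 4 collocation points

z = Var(0.0)                                      # single zero-valued leaf
a = [[Var(1.0) for _ in points] for _ in params]  # dummy weights, all ones

# One scalar root: omega = sum_ij a_ij * f(p_i, x_j + z).
omega = Var(0.0)
for i, p_i in enumerate(params):
    for j, x_j in enumerate(points):
        omega = omega + a[i][j] * toy_operator(p_i, x_j + z)

# First pass: one root (omega), one leaf (z).
(domega_dz,) = grad(omega, [z])
# Second pass: one root (domega_dz), many leaves (a_ij) gives every du_ij/dx_j.
flat = grad(domega_dz, [a_ij for row in a for a_ij in row])
du_dx = [[flat[i * len(points) + j].value for j in range(len(points))]
         for i in range(len(params))]

# Analytic reference for the toy operator: d/dx sin(p x) = p cos(p x).
expected = [[p * math.cos(p * x) for x in points] for p in params]
```

In this sketch the coordinates in `points` are never leaf variables: only `z` and the weights `a` are, and both backward passes start from a single scalar root. With a real deep learning library, the same pattern would use that library's own reverse-mode gradient call in place of the toy `grad`.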

Paper Structure (16 sections, 24 equations, 3 figures, 1 table, 1 algorithm)

Figures (3)

  • Figure 1: Understanding ZCS via limits. In (a), $\frac{\partial f(x_1)}{\partial x_1}$ and $\frac{\partial f(x_2)}{\partial x_2}$ are approached individually by taking $\Delta x_1$ and $\Delta x_2$ as independent infinitesimal increments, corresponding to taking $x_1$ and $x_2$ as independent leaf variables for AD. In (b), $\Delta z$ is the only infinitesimal increment, associated with a zero-valued dummy variable $z$, and $\frac{\partial f(x_1)}{\partial x_1}$ and $\frac{\partial f(x_2)}{\partial x_2}$ are respectively equal to $\left.\frac{\partial f(x_1+z)}{\partial z}\right|_{z=0}$ and $\left.\frac{\partial f(x_2+z)}{\partial z}\right|_{z=0}$, meaning that $z$ can be the only leaf variable for AD.
  • Figure 2: Peak GPU memory and wall time for training DeepONets with different AD strategies. The PDE is given by eq. \ref{eq:scaling}, with a maximum differential order of $P$; the function and point numbers, $M$ and $N$, are defined in eq. \ref{eq:pino}. In the three columns from left to right, we vary $M$, $N$ and $P$ respectively while fixing the other two. The measurements are taken on an NVIDIA A100 GPU with 80 GB memory.
  • Figure 3: True and predicted solutions of the Stokes flow in a square box with a moving lid. The PDEs and boundary conditions are given by eq. \ref{eq:stokes}, with the source term $u_1(x)=x(1-x)$. The details of model training are given in Table \ref{tab:pde}. The true solution is computed using FreeFEM++ (MR3043640).
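
The identity stated in the Figure 1 caption is an instance of the chain rule. As a brief worked step, assuming $f$ is differentiable and the shift $z$ enters only through the sum $x_i+z$:

$$
\left.\frac{\partial f(x_i+z)}{\partial z}\right|_{z=0}
= \left. f'(x_i+z)\,\frac{\partial (x_i+z)}{\partial z}\right|_{z=0}
= f'(x_i)
= \frac{\partial f(x_i)}{\partial x_i},
\qquad i = 1, 2.
$$

Because the same $z$ shifts every point, a single leaf variable serves all $x_i$ at once.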