Scalable Back-Propagation-Free Training of Optical Physics-Informed Neural Networks

Yequan Zhao, Xinling Yu, Xian Xiao, Zhixiong Chen, Ziyue Liu, Geza Kurczveil, Raymond G. Beausoleil, Sijia Liu, Zheng Zhang

TL;DR

The paper addresses the challenge of training physics-informed neural networks (PINNs) on edge photonic hardware by introducing a completely back-propagation-free framework. It combines a sparse-grid Stein derivative estimator for loss evaluation with tensor-train (TT) based zeroth-order optimization to update parameters, enabling scalable training for PINNs with hundreds of neurons per layer. A scalable on-chip photonic accelerator (TONN) with space- and time-multiplexed designs is proposed to support real-size PINNs, dramatically reducing device counts and enabling real-time training in simulations. Across multiple PDE benchmarks and hardware simulations, the approach achieves competitive accuracy and substantial hardware efficiency, highlighting the practicality and potential of BP-free training for edge photonic computing and beyond.
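To make the idea of a Stein derivative estimator concrete, here is a minimal one-dimensional sketch in Python. It assumes the standard Gaussian Stein identity for a smoothed function and uses Gauss-Hermite quadrature as a 1D stand-in for the sparse grid; the exact estimator and grid used in the paper are not given in this summary. The names `stein_derivative`, `sigma`, and `num_nodes` are illustrative, not the authors'.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def stein_derivative(u, x, sigma=0.1, num_nodes=20):
    """Estimate du/dx at x using only forward evaluations of u (no autodiff).

    Assumption: we differentiate the Gaussian-smoothed function
    u_sigma(x) = E[u(x + sigma * z)], z ~ N(0, 1), via the Stein identity
    d/dx u_sigma(x) = E[(z / sigma) * u(x + sigma * z)].
    """
    # Quadrature nodes and weights for the weight function exp(-z^2 / 2).
    z, w = hermegauss(num_nodes)
    # Normalize so the weights sum to 1 (expectation under N(0, 1)).
    w = w / np.sqrt(2.0 * np.pi)
    return float(np.sum(w * (z / sigma) * u(x + sigma * z)))


x0 = 0.7
sigma0 = 0.1
estimate = stein_derivative(np.sin, x0, sigma=sigma0)
# For u = sin, the smoothed derivative has the closed form exp(-sigma^2 / 2) * cos(x).
smoothed_exact = float(np.exp(-sigma0 ** 2 / 2) * np.cos(x0))
```

The estimate matches the derivative of the smoothed function, which differs from the true derivative `cos(x0)` by a bias that shrinks as `sigma` decreases. In higher dimensions a tensor-product grid grows exponentially with the dimension, which is the motivation for a sparse grid.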

Abstract

Physics-informed neural networks (PINNs) have shown promise in solving partial differential equations (PDEs), with growing interest in their energy-efficient, real-time training on edge devices. Photonic computing offers a potential solution to achieve this goal because of its ultra-high operation speed. However, the lack of photonic memory and the large device sizes prevent training real-size PINNs on photonic chips. This paper proposes a completely back-propagation-free (BP-free) and highly scalable framework for training real-size PINNs on silicon photonic platforms. Our approach involves three key innovations: (1) a sparse-grid Stein derivative estimator to avoid BP in the loss evaluation of a PINN, (2) a dimension-reduced zeroth-order optimization via tensor-train decomposition to achieve better scalability and convergence in BP-free training, and (3) a scalable on-chip photonic PINN training accelerator design using photonic tensor cores. We validate our numerical methods on both low- and high-dimensional PDE benchmarks. Through circuit simulation based on real device parameters, we further demonstrate the significant performance benefits (e.g., real-time training, huge chip area reduction) of our photonic accelerator.
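Zeroth-order (ZO) optimization, as named in innovation (2), updates parameters from loss evaluations alone. The sketch below is a plain randomized ZO gradient estimator with a simple descent loop on a toy quadratic loss. It is an assumption-laden illustration: it does not include the tensor-train dimension reduction, and the names `zo_gradient`, `zo_sgd`, `mu`, and `num_samples` are hypothetical rather than taken from the paper.

```python
import numpy as np


def zo_gradient(loss, theta, mu=1e-3, num_samples=10, rng=None):
    """Randomized finite-difference gradient estimate from forward passes only."""
    rng = np.random.default_rng() if rng is None else rng
    base = loss(theta)
    grad = np.zeros_like(theta)
    for _ in range(num_samples):
        # Random Gaussian perturbation direction.
        xi = rng.standard_normal(theta.shape)
        # Directional finite difference, projected back onto xi.
        grad += (loss(theta + mu * xi) - base) / mu * xi
    return grad / num_samples


def zo_sgd(loss, theta0, lr=0.05, steps=200, seed=0):
    """Gradient descent driven by the ZO estimate instead of back-propagation."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    for _ in range(steps):
        theta -= lr * zo_gradient(loss, theta, rng=rng)
    return theta


# Toy problem (illustrative only): recover a 3-dimensional target vector.
target = np.array([1.0, -2.0, 0.5])


def quadratic(theta):
    return float(np.sum((theta - target) ** 2))


theta_hat = zo_sgd(quadratic, np.zeros(3))
```

The variance of this estimator grows with the number of perturbed parameters, which is why reducing the trainable dimension (here, via tensor-train decomposition in the paper) matters for convergence.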

Paper Structure

This paper contains 47 sections, 27 equations, 9 figures, 14 tables.

Figures (9)

  • Figure 1: Tensor-train decomposition: matrix $\mathbf{W}$ is folded into a multi-way tensor $\boldsymbol{\mathcal{W}}$ and decomposed into $L$ small TT cores $\{\boldsymbol{\mathcal{G}}_k\}_{k=1}^L$.
  • Figure 2: (a) The overall architecture of the BP-free optical training accelerator. (b) TONN space multiplexing (TONN-SM) architecture. (c) TONN time multiplexing (TONN-TM) architecture.
  • Figure 3: Training efficiency comparison of ZO training methods.
  • Figure 4: The first two subfigures show the relative $\ell_2$ error of Black-Scholes and 20-dim HJB equations learned by different ONN training methods. The last two subfigures show the ground truth $u(x)$, and the learned solution $\hat{u}(x)$ using our proposed method.
  • Figure 5: (Same as Figure 2 (b)) TONN-SM architecture. PTC: photonic tensor core, DAC: digital-analog converter, ADC: analog-digital converter.
  • ...and 4 more figures
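
The tensor-train folding described in the Figure 1 caption can be illustrated with a parameter count. This sketch assumes the standard TT-matrix format, in which core $k$ has shape $r_{k-1} \times m_k \times n_k \times r_k$; the factorization and ranks below are hypothetical values chosen for illustration, not the settings used in the paper.

```python
from math import prod


def tt_param_count(row_factors, col_factors, ranks):
    """Count parameters of a TT-format matrix with cores of shape
    (ranks[k], row_factors[k], col_factors[k], ranks[k + 1])."""
    if not (len(row_factors) == len(col_factors) == len(ranks) - 1):
        raise ValueError("need L row factors, L column factors, and L + 1 ranks")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must be 1")
    return sum(
        ranks[k] * row_factors[k] * col_factors[k] * ranks[k + 1]
        for k in range(len(row_factors))
    )


# Hypothetical example: a 1024 x 1024 weight matrix folded with L = 4 cores.
row_factors = (4, 8, 8, 4)   # 4 * 8 * 8 * 4 = 1024 rows
col_factors = (4, 8, 8, 4)   # 4 * 8 * 8 * 4 = 1024 columns
ranks = (1, 2, 2, 2, 1)      # illustrative TT ranks

dense_params = prod(row_factors) * prod(col_factors)
tt_params = tt_param_count(row_factors, col_factors, ranks)
```

With these illustrative settings the dense matrix has 1,048,576 entries while the TT cores hold 576, which shows why training the cores instead of the full matrix shrinks both the optimization dimension and the number of devices that must be programmed.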