Online training and pruning of multi-wavelength photonic neural networks

Jiawei Zhang, Weipeng Zhang, Tengji Xu, Lei Xu, Eli A. Doris, Bhavin J. Shastri, Chaoran Huang, Paul R. Prucnal

TL;DR

The paper tackles resonance variations from fabrication and environmental fluctuations that limit the scalability and energy efficiency of microring resonator (MRR)-based photonic neural networks (PNNs). It introduces an online, chip-in-the-loop training framework coupled with a power-aware pruning term, yielding a modified loss $\tilde{\mathcal{L}} = \mathcal{L} + \gamma \mathbf{P}$ that jointly optimizes accuracy and MRR tuning power without relying on look-up tables (LUTs). Empirical validation on a 3×2 PNN using the Iris dataset shows 96% accuracy with a 44.7% reduction in tuning power, and simulations indicate orders-of-magnitude energy savings for larger networks. This approach enhances the scalability of CMOS-compatible PICs for large-scale neural processing and related photonic applications, enabling more energy-efficient, adaptable photonic accelerators.

Abstract

CMOS-compatible photonic integrated circuits (PICs) are emerging as a promising platform in artificial intelligence (AI) computing. Owing to the compact footprint of microring resonators (MRRs) and the enhanced interconnect efficiency enabled by wavelength division multiplexing (WDM), MRR-based photonic neural networks (PNNs) are particularly promising for large-scale integration. However, the scalability and energy efficiency of such systems are fundamentally limited by the MRR resonance wavelength variations induced by fabrication process variations (FPVs) and environmental fluctuations. Existing solutions use post-fabrication approaches or thermo-optic tuning, incurring high control power and additional process complexity. In this work, we introduce an online training and pruning method that addresses this challenge, adapting to FPV-induced and thermally induced shifts in MRR resonance wavelength. By incorporating a power-aware pruning term into the conventional loss function, our approach simultaneously optimizes the PNN accuracy and the total power consumption for MRR tuning. In proof-of-concept on-chip experiments on the Iris dataset, our PNN adaptively trains to maintain a 96% classification accuracy while achieving a 44.7% reduction in tuning power via pruning. Additionally, simulations indicate that our approach reduces the power consumption by orders of magnitude on larger datasets. By addressing chip-to-chip variation and minimizing power requirements, our approach significantly improves the scalability and energy efficiency of MRR-based integrated analog photonic processors, paving the way for large-scale PICs to enable versatile applications including neural networks, photonic switching, LiDAR, and radio-frequency beamforming.
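
To make the idea of a power-aware loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes that tuning power follows a Joule-heating model (the sum of $I^2 R$ over all heaters), that gradients are estimated by perturbing each tuning current and re-evaluating the loss, and that a `tanh` function can stand in for the device-specific current-to-weight response of an MRR. The names `tuning_power`, `power_aware_loss`, `estimate_gradient`, `toy_chip`, and all numeric values are hypothetical.

```python
import numpy as np

def tuning_power(currents, resistance=1.0):
    """Total heater power, modelled (assumption) as the sum of I^2 * R."""
    return float(np.sum(resistance * currents ** 2))

def power_aware_loss(task_loss, currents, gamma):
    """Modified loss: task loss plus gamma times the tuning power."""
    return task_loss + gamma * tuning_power(currents)

def estimate_gradient(loss_fn, currents, delta=1e-3):
    """Central finite differences: perturb one current at a time.

    On real hardware each loss_fn call would be a measurement on the chip.
    """
    grad = np.zeros_like(currents)
    for idx in np.ndindex(currents.shape):
        step = np.zeros_like(currents)
        step[idx] = delta
        grad[idx] = (loss_fn(currents + step) - loss_fn(currents - step)) / (2 * delta)
    return grad

def train(loss_fn, currents, lr=0.1, steps=200):
    """Plain gradient descent on the tuning currents."""
    for _ in range(steps):
        currents = currents - lr * estimate_gradient(loss_fn, currents)
    return currents

# Toy stand-in for the chip: 3 inputs, 2 outputs, 6 tuning currents.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 3))
target_weights = np.array([[0.5, 0.0, -0.3], [0.0, 0.4, 0.0]])
y = x @ target_weights.T

def toy_chip(currents, inputs):
    # Hypothetical current-to-weight map; a real MRR response is device-specific.
    return inputs @ np.tanh(currents).T

def make_loss(gamma):
    def loss_fn(currents):
        mse = np.mean((toy_chip(currents, x) - y) ** 2)
        return power_aware_loss(mse, currents, gamma)
    return loss_fn

init = rng.normal(scale=0.5, size=(2, 3))
trained_plain = train(make_loss(0.0), init)    # accuracy only
trained_pruned = train(make_loss(0.05), init)  # accuracy plus power penalty

print("power without pruning term:", tuning_power(trained_plain))
print("power with pruning term:   ", tuning_power(trained_pruned))
```

Because the weights are never read out or looked up, only the loss is observed, so any offset in the current-to-weight response is absorbed by the optimization. The value of `gamma` sets the trade-off: a larger value lowers tuning power at the cost of task loss.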

Paper Structure

This paper contains 12 sections, 16 equations, 5 figures.

Figures (5)

  • Figure 1: a Conventional offline training method supporting only small-scale PNN chips. In this approach, NN parameters are calculated in software and mapped onto the PNN chips. The resonance variations of MRRs necessitate complicated look-up tables and lead to higher power consumption for MRR control. b Online training and pruning method compatible with large-scale PNN chips. The training of PNNs occurs on the same chip used for inference, accounting for any chip-to-chip variations and environmental fluctuations. Our approach simultaneously optimizes the PNN accuracy and the total power consumption for tuning all the MRRs.
  • Figure 2: a Schematic of our experimental setup. MZM, Mach-Zehnder modulator. MUX, wavelength multiplexer. PIC, photonic integrated circuit. MRR, microring resonator. BPD, balanced photodetector. ADC, analog-to-digital converter. CPU, central processing unit. $\sigma(z)$ denotes the nonlinear activation function performed in software. The inset shows the micrograph of the two MRR weight banks and BPDs. b Normalized weight spectrum of two MRR weight banks at 20$^\circ$C. The MRRs shown in the same color (i.e., MRR1 and MRR4, MRR2 and MRR5, MRR3 and MRR6) are designed to share the same resonance wavelengths, aligned with the 200 GHz-spaced ITU grid (1546.92 nm, 1548.51 nm, 1550.12 nm). c Tuning characteristics of six MRRs at 20$^\circ$C. d Scatterplot of the Iris flower dataset. e Schematic of a simple 3$\times$2 neural network for Iris classification. f Online training and pruning procedure. At each iteration, the PIC performs matrix-vector multiplication without and with perturbations, while the CPU evaluates the gradients of the power-aware loss function and updates the MRR tuning currents for the next iteration.
  • Figure 3: a Experimental online training losses without and with the pruning method. b Simulated results indicating the trade-off between prediction accuracy and power efficiency. c–e Confusion matrices for the 150 samples, obtained by conventional offline training, online training without pruning, and online training with pruning, respectively.
  • Figure 4: Adaptive training at 20$^{\circ}$C (the upper plots) and 25$^{\circ}$C (the lower plots). a, b illustrate two of the six MRR tuning currents (weights $\textbf{I}_{11}$ and $\textbf{I}_{21}$) before and after online training, respectively.
  • Figure 5: Simulated reduction in overall power consumption enabled by online pruning across MRR-based PNNs of different sizes.
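
The three resonance wavelengths quoted in the Figure 2 caption can be checked against a 200 GHz channel spacing with a short calculation. The sketch below assumes the channels sit at 193.8, 193.6, and 193.4 THz; these frequencies are inferred from the quoted wavelengths and are not stated in the source. The function name `itu_wavelength_nm` is hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s (exact by definition)

def itu_wavelength_nm(frequency_thz):
    """Convert an optical carrier frequency in THz to a vacuum wavelength in nm."""
    return C / (frequency_thz * 1e12) * 1e9

# Assumed channel frequencies, 200 GHz (0.2 THz) apart.
channels_thz = [193.8, 193.6, 193.4]
wavelengths_nm = [round(itu_wavelength_nm(f), 2) for f in channels_thz]

for f, wl in zip(channels_thz, wavelengths_nm):
    print(f"{f:.1f} THz -> {wl:.2f} nm")
```

Under this assumption the computed values round to 1546.92 nm, 1548.51 nm, and 1550.12 nm, matching the caption. Near 1550 nm, a 200 GHz step corresponds to roughly 1.6 nm in wavelength.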