Control-free and efficient integrated photonic neural networks via hardware-aware training and pruning

Tengji Xu, Weipeng Zhang, Jiawei Zhang, Zeyu Luo, Qiarong Xiao, Benshan Wang, Mingcheng Luo, Xingyuan Xu, Bhavin J. Shastri, Paul R. Prucnal, Chaoran Huang

TL;DR

This work proposes a hardware-aware training and pruning approach that steers the parameters of a physical neural network towards its noise-robust and energy-efficient region. Experimentally, it raises the computing precision of micro-ring resonator (MRR)-based photonic neural networks (PNNs) by 4 bits.

Abstract

Integrated photonic neural networks (PNNs) are at the forefront of AI computing, leveraging light's unique properties, such as large bandwidth, low latency, and potentially low power consumption. However, the integrated optical components within PNNs are inherently sensitive to external disturbances and thermal interference, which can degrade computing accuracy and reliability. Current solutions often rely on complicated control methods, and the resulting hardware complexity is impractical for large-scale PNNs. In response, we propose a hardware-aware training and pruning approach. The core idea is to train the parameters of a physical neural network towards its noise-robust and energy-efficient region, which enables control-free and energy-efficient photonic computing. In experiments, our approach improves the computing precision of a micro-ring resonator (MRR)-based PNN by 4 bits without complex device control mechanisms or energy-intensive temperature stabilization circuits. Specifically, it improves the accuracy of experimental handwritten digit classification from 67.0% to 95.0%, approaching the theoretical limit, without a thermoelectric controller. The approach also reduces energy consumption tenfold. We further extend the validation to other architectures, such as PCM-based PNNs, demonstrating the broad applicability of our approach across different platforms. This advance is a significant step towards the practical, energy-efficient, and noise-resilient implementation of large-scale integrated PNNs.
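
As a rough illustration of the core idea, the sketch below adds a hardware penalty to an ordinary training loss so that gradient descent prefers weights in a cheaper region, then snaps near-rest weights onto the rest value as a stand-in for pruning. The quadratic penalty, the rest weight `w_rest`, the coefficient `lam`, and the pruning threshold are all assumptions made for this example; the abstract does not specify the paper's actual loss or pruning rule.

```python
# Illustrative sketch only. The quadratic power proxy, `w_rest`, `lam`, and the
# pruning threshold are assumptions, not formulas or values from the paper.

def predict(w, x):
    """Linear model output: sum of w_i * x_i."""
    return sum(wi * xi for wi, xi in zip(w, x))

def task_loss(w, xs, ys):
    """Mean squared error of the linear model over the data set."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def hardware_penalty(w, w_rest):
    """Hypothetical proxy for tuning cost: grows as weights leave `w_rest`."""
    return sum((wi - w_rest) ** 2 for wi in w) / len(w)

def gradient(w, xs, ys, w_rest, lam):
    """Gradient of task_loss + lam * hardware_penalty with respect to w."""
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = predict(w, x) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(xs)
    for i, wi in enumerate(w):
        grad[i] += 2.0 * lam * (wi - w_rest) / len(w)
    return grad

def train(xs, ys, w_rest=0.0, lam=0.1, lr=0.1, steps=500):
    """Gradient descent with weights clipped to the assumed range [-1, 1]."""
    w = [0.5] * len(xs[0])
    for _ in range(steps):
        g = gradient(w, xs, ys, w_rest, lam)
        w = [min(1.0, max(-1.0, wi - lr * gi)) for wi, gi in zip(w, g)]
    return w

def prune(w, w_rest=0.0, threshold=0.05):
    """Snap weights within `threshold` of the rest value onto it."""
    return [w_rest if abs(wi - w_rest) < threshold else wi for wi in w]

# Usage with arbitrary toy data (the generating weights are [0.8, 0.02]).
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
ys = [0.8, 0.02, 0.82, 0.78]
w_plain = train(xs, ys, lam=0.0)          # ordinary training
w_aware = prune(train(xs, ys, lam=0.1))   # penalized training, then pruning
```

With `lam=0.0` the loop reduces to plain least-squares training; raising `lam` trades task error for a smaller assumed hardware cost.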

Paper Structure (11 sections, 5 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: Illustration of NN inference on an MRR weight bank, MRR operation error, and the 'MRR pruning' optimization method. (a-i) Transforming the convolutional and fully connected layers into matrix multiplication. (a-ii) Implementing matrix multiplication on an MRR weight bank. (b-i) Change in the spectrum of a single MRR under thermal tuning. (b-ii) Tuning power and weight error as functions of weight value. (b-iii) Schematic of MRR signal transmission at weight 0 and weight 1. (b-iv) Signal-to-noise ratio as a function of weight value. (c-i) Histogram comparing weight value distributions. (c-ii) Histogram comparing tuning power distributions. (c-iii) Histogram comparing weight error distributions. NN: Neural network. MRR: Micro-ring resonator. SOI: Silicon on insulator. BPD: Balanced photodetector. THRU: Through. SNR: Signal-to-noise ratio. PNN: Photonic neural network.
  • Figure 2: (a-i) A 4$\times$1 MRR weight bank performing a dot product. Light modulated at different wavelengths is combined by the WDM, weighted by independent MRRs, and summed by the BPD. (a-ii) MRR weight bank spectrum. (a-iii) Experimentally measured tuning curve. (b-i) Change in weight distribution during optimization. (b-ii) Weight error distribution without the MRR pruning method. (b-iii) Weight error distribution with the MRR pruning method. (c) Relation between the weight errors of neighboring MRRs. WDM: Wavelength division multiplexer. MRR: Micro-ring resonator. GND: Ground. THRU: Through. BPD: Balanced photodetector. Std: Standard deviation. TEC: Thermoelectric controller.
  • Figure 3: Confusion matrices for experimental inference under experimental error. (a-i) Two-layer CNN 'MNIST' classification confusion matrix without pruning. (a-ii) LeNet-5 'MNIST' classification confusion matrix without pruning. (a-iii) ResNet-18 'CIFAR-10' classification confusion matrix without pruning. (b-i) Two-layer CNN 'MNIST' classification confusion matrix with pruning. (b-ii) LeNet-5 'MNIST' classification confusion matrix with pruning. (b-iii) ResNet-18 'CIFAR-10' classification confusion matrix with pruning. MRR: Micro-ring resonator. TEC: Thermoelectric controller.
  • Figure 4: Average MRR tuning power across neural networks of different sizes. MRR: Micro-ring resonator. CNN: Convolutional neural network.
  • Figure 5: Application to other integrated PNNs. (a-i) Diagram of the single-ended detection MRR array. (a-ii) Tuning curve of the MRR through-port transmission. (a-iii) Comparison of absolute weight distributions, with an estimated wavelength drift of 50 pm. (b-i) Diagram of the crossbar MRR array. (b-ii) Tuning curve of the MRR drop-port transmission. (b-iii) Comparison of absolute weight distributions, with an estimated wavelength drift of 150 pm. MRR: Micro-ring resonator. PD: Photodetector. BPD: Balanced photodetector. SOI: Silicon on insulator.
  • ...and 1 more figure
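
The dot product described for Figure 2 (a-i) can be sketched in a few lines: each input channel stands for one wavelength, each weight for one MRR, and the final sum for the BPD. The per-weight Gaussian error, the normalized weight range [-1, 1], and the `placeholder_error_std` function are assumptions for illustration; the captions say only that weight error varies with weight value, not how.

```python
import random

def placeholder_error_std(w):
    """Assumed noise model, for illustration only: error std grows with |w|."""
    return 0.01 + 0.02 * abs(w)

def weight_bank_dot(inputs, weights, error_std=placeholder_error_std, rng=None):
    """Noisy dot product of an N x 1 weight bank.

    inputs  : one value per wavelength channel
    weights : one value per MRR, in the assumed normalized range [-1, 1]
    """
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input channel")
    if rng is None:
        rng = random.Random()
    total = 0.0
    for x, w in zip(inputs, weights):
        if not -1.0 <= w <= 1.0:
            raise ValueError("weight outside assumed range [-1, 1]")
        # Each MRR applies its weight with an independent Gaussian error.
        noisy_w = w + rng.gauss(0.0, error_std(w))
        total += noisy_w * x  # the running sum stands in for the BPD
    return total

# Usage: a 4 x 1 bank as in Figure 2 (a-i); the numbers are arbitrary.
channels = [1.0, 0.5, 0.2, 0.0]
mrr_weights = [0.5, -1.0, 1.0, 0.3]
ideal = weight_bank_dot(channels, mrr_weights, error_std=lambda w: 0.0)
noisy = weight_bank_dot(channels, mrr_weights, rng=random.Random(0))
```

Passing a different `error_std` callable is the only change needed to try another noise model, which keeps the assumed dependence on weight value separate from the dot-product logic.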