Reuse and Blend: Energy-Efficient Optical Neural Network Enabled by Weight Sharing

Bo Xu, Yuetong Fang, Shaoliang Yu, Renjing Xu

TL;DR

This work tackles the energy and latency bottlenecks that arise when optical neural networks built from micro-ring resonators (MRRs) must host large weight matrices, and it does so by introducing a weight-sharing paradigm. The authors propose a Reuse and Blend (R&B) architecture comprising a Photonic Reuse Method (PRM) for layer-wise and block-wise weight reuse and an Opto-electronic Blend Unit (OBU) for optical transpose and electronic shuffle, enabling a single MRR crossbar to represent multiple weight matrices. Hardware-aware design details include handling full-range weights with an offset matrix, applying ReLU activation in the electrical domain, efficient normalization, and the OBU's transpose and shuffle operations, all implemented with minimal overhead. Empirical results across MLP, VGG-13, ResNet-18, and MLP-Mixer on MNIST/CIFAR datasets show up to $69\%$ energy savings and $57\%$ latency reduction while preserving accuracy, demonstrating the viability of silicon-photonic accelerators for large-scale AI workloads.
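
The summary mentions an offset matrix for full-range weights but does not spell out the mapping. The sketch below shows one common way such an offset can work, assuming signed weights in [-1, 1] and non-negative MRR transmissions in [0, 1]. The function names and the specific mapping T = (W + 1) / 2 are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: signed weights on a non-negative crossbar via an offset.
# Assumption: weights lie in [-1, 1]; MRR transmissions lie in [0, 1].

def matvec(matrix, vec):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def to_transmission(weights):
    """Map signed weights in [-1, 1] to non-negative values in [0, 1]."""
    return [[(w + 1.0) / 2.0 for w in row] for row in weights]

def signed_matvec(transmission, vec):
    """Recover W @ x from the non-negative matrix T = (W + 1) / 2.

    Since W = 2T - J, where J is the all-ones offset matrix,
    W @ x = 2 * (T @ x) - sum(x) for every output row.
    """
    offset = sum(vec)  # every row of J @ x equals sum(x)
    return [2.0 * y - offset for y in matvec(transmission, vec)]

weights = [[0.5, -1.0, 0.25], [-0.75, 0.0, 1.0]]
x = [1.0, 2.0, 4.0]

expected = matvec(weights, x)                           # direct signed product
recovered = signed_matvec(to_transmission(weights), x)  # via offset correction
```

In this sketch the offset term depends only on the input vector, so it can be computed once per input and subtracted electronically from every output channel.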

Abstract

Optical neural networks (ONNs) based on micro-ring resonators (MRRs) have emerged as a promising alternative for significantly accelerating the massive matrix-vector multiplication (MVM) operations in artificial intelligence (AI) applications. However, the limited scale of MRR arrays presents a challenge for AI acceleration. The disparity between the small MRR arrays and the large weight matrices in AI necessitates extensive MRR writes, including reprogramming and calibration, resulting in considerable latency and energy overheads. To address this problem, we propose a novel design methodology to lessen the need for frequent weight reloading. Specifically, we propose a reuse and blend (R&B) architecture to support efficient layer-wise and block-wise weight sharing, which allows weights to be reused several times between layers/blocks. Experimental results demonstrate that the R&B system can maintain comparable accuracy with 69% energy savings and 57% latency improvement. These results highlight the promise of the R&B architecture to enable the efficient deployment of advanced deep learning models on photonic accelerators.
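
To make the reloading argument concrete, the following sketch counts crossbar tile writes for a stack of layers with and without weight sharing. It is a simplified counting model under stated assumptions: each layer is tiled onto square arrays of a hypothetical size `array_size`, every tile costs one write, and layers in the same sharing group reuse tiles that stay resident after the first write. The layer shapes and grouping are made up for illustration and are not taken from the paper's experiments.

```python
import math

def tiles_needed(rows, cols, array_size):
    """Number of array_size x array_size tiles needed to cover a rows x cols matrix."""
    return math.ceil(rows / array_size) * math.ceil(cols / array_size)

def count_writes(layers, array_size, share_groups=None):
    """Count tile writes for a list of (rows, cols) layers.

    share_groups optionally maps each layer index to a group id (or None).
    Assumption: layers in one group share weights, so only the first layer
    of the group pays the write cost.
    """
    already_written = set()
    total = 0
    for idx, (rows, cols) in enumerate(layers):
        group = share_groups[idx] if share_groups else None
        if group is not None and group in already_written:
            continue  # weights already programmed, reuse them
        total += tiles_needed(rows, cols, array_size)
        if group is not None:
            already_written.add(group)
    return total

# Hypothetical workload: four 256x256 layers on 64x64 arrays.
layers = [(256, 256)] * 4
baseline_writes = count_writes(layers, array_size=64)
shared_writes = count_writes(layers, array_size=64, share_groups=[0, 0, 1, 1])
```

With these made-up numbers, pairing the layers halves the write count. The savings reported in the abstract come from the authors' own measurements, not from this model.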


Paper Structure

This paper contains 20 sections, 7 equations, 3 figures, 5 tables, 1 algorithm.

Figures (3)

  • Figure 1: Comparison of energy consumption: without weight sharing (left) vs. with weight sharing (right). The reuse-and-blend (R&B) method significantly reduces dynamic power consumption, mainly because fewer MRR writes are needed and less energy is therefore spent on programming and calibration.
  • Figure 2: (a) Overview of the R&B architecture. Each PPU contains a photonic MVM unit and a sample-and-hold (S&H) unit. (b) Photonic Reuse Method (PRM). Block-wise reuse allows weight sharing among blocks (a block typically contains multiple layers). Layer-wise reuse enables weight sharing between individual layers. (c) Opto-electronic Blend Unit (OBU). OBUs handle shuffle operations via the peripheral circuit and perform transpose operations directly in the optical domain. (d) Computing pipeline of the R&B architecture.
  • Figure 3: Illustration of the transposed dot product in MRR: (a) horizontal input and (b) vertical input.
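
The captions for Figures 2(c) and 3 describe reading one programmed crossbar from two directions and shuffling results in the peripheral circuit. The sketch below models that idea numerically. It assumes that a horizontal input yields the product with the stored matrix W and a vertical input yields the product with its transpose. The direction-to-operation mapping, the function names, and the permutation are illustrative assumptions.

```python
def crossbar_mvm(stored, vec, direction="horizontal"):
    """Model a crossbar read without reprogramming the stored weights.

    Assumption: "horizontal" computes W @ x (one output per row) and
    "vertical" computes W^T @ x (one output per column).
    """
    n_rows, n_cols = len(stored), len(stored[0])
    if direction == "horizontal":
        if len(vec) != n_cols:
            raise ValueError("horizontal input must match the column count")
        return [sum(w * x for w, x in zip(row, vec)) for row in stored]
    if direction == "vertical":
        if len(vec) != n_rows:
            raise ValueError("vertical input must match the row count")
        return [sum(stored[i][j] * vec[i] for i in range(n_rows))
                for j in range(n_cols)]
    raise ValueError("direction must be 'horizontal' or 'vertical'")

def shuffle(vec, perm):
    """Electronic shuffle: reorder outputs so that result[k] = vec[perm[k]]."""
    return [vec[p] for p in perm]

stored = [[1, 2, 3],
          [4, 5, 6]]  # written to the crossbar once

forward = crossbar_mvm(stored, [1, 0, -1], "horizontal")  # uses W
transposed = crossbar_mvm(stored, [1, 2], "vertical")     # uses W^T
blended = shuffle(transposed, [2, 0, 1])                  # reordered outputs
```

In this model a single write serves two distinct linear maps, and the shuffle adds further variation between reuses at no additional programming cost.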