Low-power scalable multilayer optoelectronic neural networks enabled with incoherent light

Alexander Song; Sai Nikhilesh Murty Kottapalli; Rahul Goyal; Bernhard Schölkopf; Peer Fischer

TL;DR

This work tackles the data movement and energy bottlenecks of optical neural accelerators by introducing a multilayer incoherent architecture that interleaves optical matrix-vector multiplications (MVMs) with optoelectronic nonlinear activations. Using 2D LED and photodiode arrays connected through local analog electronics and a single amplitude mask per optical interconnect, the system performs multiple MVMs in sequence with low I/O overhead. Experimentally, a three-layer network classifies MNIST digits with accuracy close to that of digital simulations (about 91–92%) and separates nonlinear spiral data with 86% accuracy, while weight transfer from pretrained networks suggests practical applicability to large-scale models. The approach is scalable, energy-efficient, and amenable to deployment as an optical accelerator for inference, with potential TOPS/W gains as system size and speed increase.

Abstract

Optical approaches have made great strides towards the goal of high-speed, energy-efficient computing necessary for modern deep learning and AI applications. Read-in and read-out of data, however, limit the overall performance of existing approaches. This study introduces a multilayer optoelectronic computing framework that alternates between optical and optoelectronic layers to implement matrix-vector multiplications and rectified linear functions, respectively. Our framework is designed for real-time, parallelized operations, leveraging 2D arrays of LEDs and photodetectors connected via independent analog electronics. We experimentally demonstrate this approach using a three-layer network with two hidden layers, which recognizes images from the MNIST database with an accuracy of 92% and classifies a nonlinear spiral dataset with 86% accuracy. By implementing multiple layers of a deep neural network simultaneously, our approach significantly reduces the number of read-ins and read-outs required and paves the way for scalable optical accelerators requiring ultra-low energy.
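The alternation of optical matrix-vector multiplication and optoelectronic rectification described in the abstract can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' implementation. It assumes that each signed weight matrix is split into two nonnegative parts (one per photodiode of a pair), that the analog circuit takes the difference of the two photocurrents, and that the LED truncates negative drive. The function names, the random weights, and the layer sizes beyond the 64-unit input and 50-unit first hidden layer from the Figure 2 caption are hypothetical; optical losses, mask transmission limits, noise, and cross-talk are ignored.

```python
import numpy as np

def split_signed_weights(w):
    """Split a signed matrix into nonnegative parts so that w = w_pos - w_neg."""
    return np.maximum(w, 0.0), np.maximum(-w, 0.0)

def optoelectronic_layer(x, w):
    """One idealized layer: nonnegative optical MVMs, then differencing and rectification.

    x: nonnegative LED intensities, shape (n_in,)
    w: signed weights, shape (n_out, n_in)
    """
    w_pos, w_neg = split_signed_weights(w)
    pd_pos = w_pos @ x  # signal on the "positive" photodiode of each pair
    pd_neg = w_neg @ x  # signal on the "negative" photodiode of each pair
    # The difference drives an LED, which cannot emit negative light (ReLU).
    return np.maximum(pd_pos - pd_neg, 0.0)

def forward(x, weights):
    """Chain several layers without reading intermediate values out."""
    for w in weights:
        x = optoelectronic_layer(x, w)
    return x

# Hypothetical sizes: 64 and 50 follow the Figure 2 caption; the rest are assumed.
layer_sizes = [64, 50, 50, 10]
rng = np.random.default_rng(0)
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

x0 = rng.uniform(size=layer_sizes[0])  # nonnegative input intensities
out = forward(x0, weights)

# Digital reference: an ordinary ReLU network with the same signed weights.
ref = x0
for w in weights:
    ref = np.maximum(w @ ref, 0.0)
```

In this idealized setting `out` agrees with `ref` up to floating-point rounding, which is the sense in which a pair of nonnegative optical MVMs followed by differencing reproduces a signed linear layer with a rectified linear activation.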

Paper Structure (18 sections, 3 equations, 6 figures)

Introduction
Results
Multilayer optoelectronic neural network
Image classification
Deep optical accelerators with weight transfer
Discussion
Methods
MNIST dataset and processing
Control software
Network training
Electronics design and operation
Optics design and operation
Alignment and calibration of optics/electronics
Acknowledgements
Author Information
...and 3 more sections

Figures (6)

Figure 1: (a) The multilayer optoelectronic neural network uses a series of interleaved optical and electronic layers to implement matrix multiplication and nonlinearity, respectively. The inset illustrates (b) a nonnegative fully connected MVM that is implemented dynamically using a 2D array of incoherent light emitting diodes (LEDs), each encoding a neuron activation in our system. Each LED is associated with a 2D subarray of amplitude-encoded weights that map onto a 2D array of photodiodes (PDs). (c) An electronic board contains a parallel array of neurons each associated with a pair of photodiodes representing the positive and negative inputs to the neuron.
Figure 2: Schematic of our multilayer optoelectronic neural network implementation with optical operations (green) and electronic operations (blue). (a) Data is read in electronically to an Input layer with 64 units arranged on an $8 \times 8$ array of LEDs. A fully connected matrix-vector multiplication (MVM) maps light from these units to a $10 \times 10$ array of photodiodes (PDs). Hidden layer 1 combines pairs of values from the PDs to drive a $5 \times 10$ array of LEDs. A second MVM and hidden layer implement Hidden layer 2, and a third MVM is mapped onto an $8 \times 8$ array of PDs of the Output layer. (b) Ray tracing illustrates how a fully connected MVM operation is performed. (c) Amplitude weights are nonnegative, and the signals from a pair of photodiodes are fed into an analog electronic circuit that performs a differencing operation before driving an LED. (d) Example output LED response to a pair of detector inputs. Negative currents in the circuit are truncated by the LED, effectively implementing a linear rectification.
Figure 3: MNIST digit classification with a three-layer optoelectronic neural network. (a) Example propagation of a trained miniaturized MNIST digit through the three-layer network. Digital simulation values are compared to the analog experimental values. (b) Correlation between simulation and experiment of activations in Hidden layer 1 in response to individual miniaturized MNIST digits. (c) Same as (b), but in Hidden layer 2. (d) Confusion matrix of estimated classes for simulated results, in percent. (e) Same as (d), but for experimental results.
Figure 4: (a) A nonlinear four-class spiral data classification problem with two input variables. Each of the classes corresponds to one arm of the spiral. (b) The best linear classifier achieves an accuracy of 30.1% on this problem. (c) Experimental classification using the multilayer network as described in Fig. 3a obtains a classification accuracy of 86.0%. (d) Comparison between simulation and experiment of the trained network output values.
Figure 5: Calibration of the multilayer optoelectronic neural network. (a) Temporal response of the electronics to changes in photodetector input. (b) Distribution of maximum optical weights and extinction ratio of an amplitude mask implemented with a liquid-crystal display (LCD). (c) Scatterplot of PD1 and PD2 from pairs of PDs implementing a nonnegative ReLU, with a color-coded bias offset for individual pairs. (d) Average cross-talk distribution from weights implemented on the LCD.
...and 1 more figures
