Emergent learning: neuromorphic photonic computing with accelerated training

Sara Peña-Gutiérrez, Giorgio Gosti, Hongsheng Chen, Giancarlo Ruocco, Marco Leonetti

TL;DR

The Photonic Emergent Learning platform is flexible and fabrication-free, and it relies primarily on analog processes. This shifts the computational burden of training from the digital layers to the optical domain, reducing computational cost and enhancing performance.

Abstract

Emergent learning transforms a disordered optical medium into a photonic device capable of storage, recognition, and classification of arbitrary memory patterns. First, we show that the intensity at the output of a multiply scattering system can be described by a dyadic matrix, the optical-synaptic matrix, which has the same form as a Hebbian synaptic matrix containing a single memory. Then, we employ emergent learning - an approach inspired by neuroscience - to exploit the vast dictionary of raw memories inherently available within a disordered optical structure, thereby engineering the optical-synaptic matrix to store a user-defined attractor, or tailored memory. Importantly, these photonic structures also work as optical comparators, providing an intensity-based measure of the degree of similarity between a query pattern and the stored pattern, and thus realizing a hardware co-localization of memory and optical operator. Our system has an almost unlimited hardware capacity for tailored memories/operators ($\mathcal{M} \sim 10^{60557}$); these tailored memories can then be employed as examples to build classifier hardware based on intensity comparison, without the need for additional digital transformation layers. Remarkably, this Photonic Emergent Learning platform is not only flexible and fabrication-free, but also relies primarily on analog processes, thus shifting the computational burden of training from the digital layers to the optical domain, reducing the computational cost and enhancing performance.
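
One way to see the dyadic structure mentioned above is the usual linear transmission description of a scattering medium. The symbols $\bm{t}^o$, $V^o$, $J$ and $\bm{\xi}$ below are illustrative and may differ from the paper's own notation. If $\bm{t}^o$ collects the complex transmission coefficients from the $N$ input pixels to output mode $o$, and $\bm{S}$ is a real query pattern, then

$$
I^o = \left|(\bm{t}^o)^\top \bm{S}\right|^2 = \bm{S}^\top \bm{t}^o (\bm{t}^o)^\dagger \bm{S} = \bm{S}^\top V^o \bm{S}, \qquad V^o = \bm{t}^o (\bm{t}^o)^\dagger .
$$

For comparison, a Hebbian synaptic matrix storing a single memory $\bm{\xi}$ is $J = \bm{\xi}\bm{\xi}^\top$, which is the same rank-one outer-product form as $V^o$.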

Paper Structure

This paper contains 15 sections, 17 equations, 4 figures.

Figures (4)

  • Figure 1: (a) Demonstration of the effectiveness of Emergent Learning. Hamming distance of $\bm{S}^*$ from $\overline{\bm{d}}$ plotted against the reservoir-to-features ratio $O/N$. Simulations performed with $N$ = 144, 256 and 324, $O$ = 500000, and $M$ = 250 light modes. The memories used are the $m$th-selected modes $\overline{\bm{d}}^m$ most similar to the input and the final tailored memories $\overline{\bm{d}}^\Sigma$ obtained by aggregating these modes. (b) Detailed plot of the Hamming distance in (a) for the tailored memories $\overline{\bm{d}}^\Sigma$. (c) Intensity as a proxy of pattern distance. Hamming distance of $\overline{\bm{d}}^o$ from $\bm{S}^*$ plotted against the intensity at the same mode. Simulations performed with $N$ = 144 and $O$ = 500000. (d) Reading efficiency. Fidelity of the retrieved memory to the presented pattern, measured through the Hamming distance for $O$ = 500000 using input patterns $\bm{S}^*$ of size $N$ = 144, 256 and 400.
  • Figure 2: Emergent reservoir optical scheme. Illustration of the set-up and working conditions of the SPRC based on $PhEL$. The SPRC comprises a digital micromirror device, which projects the query pattern $\bm{S}$ into the disordered medium. The multiply scattered light is filtered by a linear polarizer (LP) and then focused onto the camera sensor. Inset, bottom left: Each mode $o$ of the speckle pattern has a corresponding raw memory $\bm{d^{o}}$. The intensity of a speckle mode $I^o$ is related to the projection of the raw memory $\bm{d^{o}}$ onto the query pattern $\bm{S}$. The angle between them determines the intensity value, which is maximum when the two vectors are parallel. Top right: Representation of the hierarchical organization of memory types. From the reservoir features of the input pattern $\bm{S}^*$, the raw memories $\overline{\mathbf{d}^o}$ (first category) associated with the maximum intensity at their modes $\{I^o\}^M_{max}$ are selected to generate a tailored memory $\overline{\mathbf{d}}^\Sigma$ (second category). The tailored memories are in turn gathered into memory classes $c$ according to the class they belong to (third category).
  • Figure 3: Characterization of the SPRC system. (a) Memory capacity assessment based on the normalized aggregated intensity $\overline{I^{\Sigma}}$ of the query pattern $\bm{S}^{\Sigma_{err}}$ using its tailored memory $\overline{\bm{d}}^\Sigma$, as a function of the level of corruption present in the query. The inset zooms into the region of very low corruption levels. The orange point marks the level at which the corrupted patterns $\bm{S}^{\Sigma_{err}}$ and the original $\bm{S}^*$ start to be distinguishable. Here we employ larger input patterns of $N$ = 112896 pixels to demonstrate the feasibility of $PhEL$ in the very large $N$ region. (b) Average Hamming distance as a function of the reservoir-to-features ratio $O/N$ when recovering the tailored memory of the input pattern $\bm{S}^*$ using $M$ = 250 modes in our system, for different $O$. (c) Recognition assessment of a specific pattern $\bm{S}^*$ through the aggregated intensity of the query patterns $I^{MNI}(\bm{S}^{MNI})$ with respect to the tailored memory $\overline{\bm{d}}^{MNI}$ to be recognized. The number of modes used to design the memory is $M$ = 1000. (d) Mean recognition error and standard deviation, computed over 100 patterns. We report data for the first $10^3$ query probe patterns, while writing is performed with $M = 10^3$.
  • Figure 4: Classification technique and results. (a) Scheme of the classification flow. The aggregated intensity $I^\Sigma$ of each tailored memory $p$ is calculated, and the maximum value is taken as the output of the class $c$. The decision layer comprises the ten classes 0 to 9. The highest value in the decision layer is the classification output. (b) Confusion matrix of the classification of 1000 patterns of the MNIST dataset using $M$ = 1000 light modes. (c) Simulation of the computational complexity in FLOPs of classification training, using $P$ = 441 samples per class of size $N$ = $15 \times 15$ for $C$ = 10 digit categories, applying the $PhEL$ (blue line) and the $RRS$ approach (Saade2016) (red line). Vertical dashed lines mark the limits of computational resources using $RRS$ (light red) at $O = 3\cdot 10^4$ and $PhEL$ (light blue) at $O = 4\cdot10^9$, the average number of pixels of our array detector, $O = 5.6\cdot10^6$ (bold black), and the highest number of pixels of a commercial array detector (BASLER), $O = 1.52\cdot10^8$ (light black). (d) Classification efficiency as a function of the number of light modes selected: to generate the tailored memory in the optical system ($PhEL$, blue dots) or as features to be introduced in the digital training ($RRS$, red dots).
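
The writing, reading and classification steps described in the captions above can be illustrated with a small numerical toy. This is a minimal sketch under strong assumptions, not the authors' implementation. The medium is modelled as a real Gaussian matrix `T` with one row per output mode, patterns are ±1 vectors, and a tailored memory is taken to be the sign of the leading eigenvector of the summed dyadic matrices of the `M` brightest modes. The aggregated intensity is assumed to be the sum of the query's intensities over those same modes. All function names and parameter values are hypothetical.

```python
import numpy as np

def write_memory(T, s_star, M):
    """Pick the M modes brightest for s_star; build a tailored memory from them."""
    intensity = (T @ s_star) ** 2            # toy model: I^o = (t^o . S)^2
    modes = np.argsort(intensity)[-M:]       # ascending sort, so the last M are brightest
    J = T[modes].T @ T[modes]                # sum of rank-one (Hebbian-like) matrices
    _, vecs = np.linalg.eigh(J)              # eigenvalues ascending: last column leads
    d_sigma = np.where(vecs[:, -1] >= 0, 1.0, -1.0)
    return modes, d_sigma

def aggregated_intensity(T, modes, query):
    """Assumed reading rule: total intensity of the query over the stored modes."""
    return float(((T[modes] @ query) ** 2).sum())

def hamming(a, b):
    """Fraction of differing pixels, ignoring the global sign of the memory."""
    h = float(np.mean(a != b))
    return min(h, 1.0 - h)

def corrupt(rng, s, k):
    """Return a copy of s with k randomly chosen pixels flipped."""
    q = s.copy()
    q[rng.choice(s.size, size=k, replace=False)] *= -1
    return q

def classify(T, class_memories, query):
    """Per class, keep the largest aggregated intensity; return the best class."""
    scores = {c: max(aggregated_intensity(T, m, query) for m in mems)
              for c, mems in class_memories.items()}
    return max(scores, key=scores.get), scores

rng = np.random.default_rng(0)
N, O, M = 64, 20000, 50                      # illustrative sizes, far below the paper's
T = rng.standard_normal((O, N))              # hypothetical stand-in for the medium

# Writing: compare a single raw memory with the aggregated (tailored) one.
s_star = rng.choice([-1.0, 1.0], size=N)
modes, d_sigma = write_memory(T, s_star, M)
raw = np.where(T[modes[-1]] >= 0, 1.0, -1.0) # sign pattern of the brightest mode
print("Hamming distance, single raw memory:", hamming(raw, s_star))
print("Hamming distance, tailored memory:  ", hamming(d_sigma, s_star))

# Reading: intensity as a similarity measure between query and stored pattern.
I_match = aggregated_intensity(T, modes, s_star)
I_noisy = aggregated_intensity(T, modes, corrupt(rng, s_star, 6))
I_rand = aggregated_intensity(T, modes, rng.choice([-1.0, 1.0], size=N))
print("Aggregated intensity (stored, corrupted, random):", I_match, I_noisy, I_rand)

# Classification: two classes with two stored examples each.
examples = {c: [rng.choice([-1.0, 1.0], size=N) for _ in range(2)] for c in (0, 1)}
class_memories = {c: [write_memory(T, s, M)[0] for s in pats]
                  for c, pats in examples.items()}
label, scores = classify(T, class_memories, corrupt(rng, examples[1][0], 6))
print("Predicted class:", label)
```

In this toy, the only training-like step is ranking modes by measured intensity, which is the sense in which the selection work sits in the optical measurement rather than in a digital optimization. The eigenvector step is a digital convenience of the sketch and should not be read as part of the physical platform.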