Nonlinear Inference Capacity of Fiber-Optical Extreme Learning Machines

Sobhi Saeed, Mehmet Müftüoglu, Glitta R. Cheeran, Thomas Bocklitz, Bennet Fischer, Mario Chemnitz

TL;DR

This work addresses how to quantify the nonlinear inference capacity of physics-based neuromorphic hardware, specifically fiber-optical extreme learning machines (ELMs). It develops a frequency-domain optical implementation, tested in two dispersion regimes, that maps inputs through the fiber's intrinsic nonlinear dynamics to a linear readout, and pairs it with a scalable spiral benchmark that enables cross-platform comparisons against digital models. Key findings show that higher nonlinearity (encoded via the soliton number $N$) improves performance on highly nonlinear tasks, where the anomalous-dispersion fiber outperforms the normal-dispersion fiber, while MNIST gains little from nonlinearity. The paper proposes a benchmark framework and the use of soliton-number-based nonlinear inference capacity as a platform-independent metric to guide future evaluation of unconventional, physics-inspired computing architectures.

Abstract

The intrinsic complexity of nonlinear optical phenomena offers a fundamentally new resource to analog brain-inspired computing, with the potential to address the pressing energy requirements of artificial intelligence. We introduce and investigate the concept of nonlinear inference capacity in optical neuromorphic computing in highly nonlinear fiber-based optical Extreme Learning Machines. We demonstrate that this capacity scales with nonlinearity to the point where it surpasses the performance of a deep neural network model with five hidden layers on a scalable nonlinear classification benchmark. By comparing normal and anomalous dispersion fibers under various operating conditions and against digital classifiers, we observe a direct correlation between the system's nonlinear dynamics and its classification performance. Our findings suggest that image recognition tasks, such as MNIST, are incomplete in showcasing deep computing capabilities in analog hardware. Our approach provides a framework for evaluating and comparing computational capabilities, particularly their ability to emulate deep networks, across different physical and digital platforms, paving the way for a more generalized set of benchmarks for unconventional, physics-inspired computing architectures.

Paper Structure

This paper contains 7 sections, 4 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Illustration of the data flow in the fiber-based neuromorphic system using an example from the spiral dataset. (a) Input data: four sample points were selected from four different spirals. (b) Corresponding data encoding: the first half of the spectral encoding range (limited by the WaveShaper) encodes the $X_1$-coordinate of a data tuple; the second half encodes the $X_2$-coordinate. (c) Encoding phase after multiplying with a constant phase scale factor and an arbitrary but fixed mask. (d) The experimental setup used for processing. A computer is used as I/O device and is not part of a feedback loop. (e) Linear spectral intensities at fiber output corresponding to the four sample inputs. (f) Linear spectral intensities at the selected, optimized search bins serving as system read-outs. (g) Prediction scores obtained by multiplying the read-outs with the trained weight matrix. Per sample, scores are sorted from class 1 to 4, from top to bottom. The highest value in each vector of four (i.e., argmax($\textbf{Y}^{score}$)) determines the predicted class. (h) Prediction results: points represent the predictions, while circles around these points indicate the true class labels. The examples contain one misclassification, indicated by the red cross.
  • Figure 2: 3D plots illustrating the relationship between output spectral intensity of a supercontinuum from an anomalous dispersive fiber and the input coordinates for all given samples. (a-d) Logarithmic spectral intensity versus input coordinates ($X_1$, $X_2$) at the optimized, selected search bins, demonstrating the system's intrinsically distinguishable response to different classes. (e-f) Spectral intensity versus input coordinates at two randomly selected wavelength windows. Both examples are still more than 10 dB above the spectrometer's noise floor.
  • Figure 3: (a) Average and standard deviation (STD) of output spectral intensities for the spirals dataset in the normal dispersion (ND) case. (b) Average and STD of output spectral intensities for the spirals dataset in the anomalous dispersion (AD) case. (c) Classification accuracy achieved for both fiber types as a function of the number of search bins. (d) Classification results for the ND case using 50 search bins. (e) Classification results for the AD case using 50 search bins.
  • Figure 4: (a, b) Average and standard deviation (STD) of measured output spectral intensities for the MNIST dataset in the (a) normal dispersion (ND) case and (b) anomalous dispersion (AD) case. (c) Classification accuracy achieved across 300 MNIST test samples as a function of the number of search bins for both fiber types. (d, e) Confusion matrices of our systems for unseen test data for the (d) ND case (achieved accuracy 89.33%), and (e) AD case (achieved accuracy 87.3%) using 150 search bins for both cases.
  • Figure 5: (a) Best test accuracy on 200 spiral data samples achieved by digital classifiers (a linear-kernel support vector machine with 100 support vectors and neural networks in different configurations, cf. Tab. \ref{tab:1A}; all trained for 1000 training epochs) and by our fiber-optical ELM using 100 search bins, for increasing nonlinear problem hardness in the spiral task, defined by the maximum angular span $\theta_{max}$. (b, c) Test accuracies on 200 spiral data samples as a function of system nonlinearity (or attenuation) and maximum angular span for (b) normal dispersion and (c) anomalous dispersion.
  • ...and 5 more figures
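
The data flow described in the Figure 1 caption (coordinate encoding, fixed mask, nonlinear transform, linear read-out, argmax) can be sketched digitally. The following is a minimal illustration under stated assumptions, not the authors' implementation: the fiber is replaced by a hypothetical surrogate (the power spectrum of a phase-modulated field), the read-out simply takes the first spectral bins instead of optimized search bins, the weights come from ridge regression, and all function names, parameter values, and the spiral parametrization are assumptions chosen for illustration.

```python
import numpy as np

def make_spirals(n_per_class=100, n_classes=4, theta_max=np.pi, noise=0.02, seed=0):
    """Interleaved spiral arms; theta_max sets how far each arm winds (assumed form)."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for k in range(n_classes):
        theta = np.linspace(0.0, theta_max, n_per_class)
        r = np.linspace(0.1, 1.0, n_per_class)
        angle = theta + 2.0 * np.pi * k / n_classes  # rotate each arm
        pts = np.stack([r * np.cos(angle), r * np.sin(angle)], axis=1)
        X.append(pts + noise * rng.standard_normal(pts.shape))
        y.append(np.full(n_per_class, k))
    return np.concatenate(X), np.concatenate(y)

def encode_phase(X, n_pixels=64, scale=np.pi, seed=1):
    """First half of the pattern carries X1, second half X2, times a fixed mask."""
    rng = np.random.default_rng(seed)
    mask = rng.uniform(0.0, 1.0, n_pixels)  # arbitrary but fixed
    half = n_pixels // 2
    coords = np.concatenate(
        [np.repeat(X[:, :1], half, axis=1),
         np.repeat(X[:, 1:2], n_pixels - half, axis=1)],
        axis=1,
    )
    return scale * coords * mask

def surrogate_readout(phase, n_bins=50):
    """Stand-in for the fiber: power spectrum of exp(i*phase), first n_bins bins."""
    spectrum = np.abs(np.fft.fft(np.exp(1j * phase), axis=1)) ** 2
    return spectrum[:, :n_bins]

def add_bias(H):
    """Append a constant column so the linear read-out has an offset term."""
    return np.hstack([H, np.ones((H.shape[0], 1))])

def train_readout(H, y, n_classes, ridge=1e-3):
    """Ridge regression from read-outs to one-hot class targets."""
    Hb = add_bias(H)
    Y = np.eye(n_classes)[y]
    A = Hb.T @ Hb + ridge * np.eye(Hb.shape[1])
    return np.linalg.solve(A, Hb.T @ Y)

def predict(H, W):
    """Predicted class is the argmax of the score vector, as in Figure 1(g)."""
    return np.argmax(add_bias(H) @ W, axis=1)

X, y = make_spirals(theta_max=np.pi)
H = surrogate_readout(encode_phase(X), n_bins=50)
W = train_readout(H, y, n_classes=4)
pred = predict(H, W)
print("training accuracy:", np.mean(pred == y))
```

Only the read-out weights are trained here; the mask and the surrogate transform stay fixed, which is the defining property of an extreme learning machine. Increasing `theta_max` makes the arms wind further and the classes harder to separate, so this sketch can be used to explore how a fixed nonlinear transform copes with growing task hardness. A held-out test split is omitted for brevity.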