Model-free front-to-end training of a large high performance laser neural network

Anas Skalli, Satoshi Sunada, Mirko Goldmann, Marcin Gebski, Stephan Reitzenstein, James A. Lott, Tomasz Czyszanowski, Daniel Brunner

TL;DR

This work tackles the challenge of building autonomous, high-performance optical neural networks (ONNs) by combining a multimode large-area VCSEL with hardware-friendly, model-free training. A software-based ceiling analysis reveals that allowing both positive and negative weights, enabling tunable input connectivity, and achieving sufficient weight resolution are crucial for performance on real-world tasks like MNIST. Guided by these insights, the authors implement a fully tunable ONN with input and output weight modulation, and benchmark several hardware-compatible training strategies (FD, SPSA, CMA-ES, PEPG, PSO) on toy problems and MNIST, finding that PEPG offers the best convergence efficiency under hardware constraints. The study demonstrates that a VOI-based ONN with a VCSEL can surpass a hardware-linear baseline on MNIST, highlighting the practical potential of autonomous photonic neuromorphic processors and providing actionable design and optimization guidance for future ONN hardware.

Abstract

Artificial neural networks (ANNs) have become ubiquitous and have revolutionized many applications, ranging from computer vision to medical diagnoses. However, they offer a fundamentally connectionist and distributed approach to computing, in stark contrast to classical computers that use the von Neumann architecture. This distinction has sparked renewed interest in developing unconventional hardware to support more efficient implementations of ANNs, rather than merely emulating them on traditional systems. Photonics stands out as a particularly promising platform, providing scalability, high speed, energy efficiency, and the ability for parallel information processing. However, fully realized autonomous optical neural networks (ONNs) with in-situ learning capabilities are still rare. In this work, we demonstrate a fully autonomous and parallel ONN built from off-the-shelf components around a multimode vertical-cavity surface-emitting laser (VCSEL). Our ONN is highly efficient and is scalable both in network size and in inference bandwidth towards the GHz range. High-performance, hardware-compatible optimization algorithms are necessary to minimize reliance on external von Neumann computers and thereby fully exploit the potential of ONNs. As such, we present and extensively study several algorithms that are broadly compatible with a wide range of systems. We then apply these algorithms to optimize our ONN, and benchmark them using the MNIST dataset. We show that our ONN can achieve high accuracy and convergence efficiency, even under limited hardware resources. Crucially, we compare these algorithms in terms of scaling and of optimization efficiency, measured by convergence time, which is decisive when working with limited external resources. Our work provides some guidance for the design of future ONNs as well as a simple and flexible way to train them.

Paper Structure

This paper contains 32 sections, 24 equations, 35 figures, 5 tables, 6 algorithms.

Figures (35)

  • Figure 1: Results of the MNIST classification task using Boolean readout weights of the LA-VCSEL.
  • Figure 2: (a) Comparison between a single-layer FFNN in blue and an ELM in red on the MNIST dataset, the baseline performance for a linear classifier is shown in green. (b) Performance for FFNN and ELM architectures as a function of the number of neurons. (c) Ratio between the number of neurons needed for the ELM to reach the same accuracy compared to the FFNN. (d) Performance of the FFNN as a function of the weight resolution.
  • Figure 3: Summary of different hardware-compatible strategies with their advantages and disadvantages, compiled from [wright2022deep, nakajima2022physical, hinton2022forward, oguz2023forward, scellier2017equilibrium].
  • Figure 4: The Rastrigin function in 2D.
  • Figure 5: CMA-ES algorithm behavior on the Rastrigin function in $D=10$ dimensions, only the first two are shown. The mean of the distribution is represented by the red dot, the population by the black dots and the ellipsoid helps to visualize the shape of the distribution.
  • ...and 30 more figures
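
The Rastrigin function named in Figures 4 and 5 is a standard multimodal test function, and SPSA is one of the model-free strategies listed in the TL;DR. The sketch below is a minimal illustration of both, not the authors' implementation. It assumes the textbook Rastrigin form with amplitude A = 10, which may differ from the paper's parametrization, and the function names `rastrigin` and `spsa_gradient` and the step size `eps` are hypothetical choices made here.

```python
import numpy as np


def rastrigin(x, a=10.0):
    """Textbook Rastrigin function; a=10 is an assumed, conventional amplitude."""
    x = np.asarray(x, dtype=float)
    # Global minimum f(0) = 0, surrounded by a regular grid of local minima.
    return a * x.size + np.sum(x**2 - a * np.cos(2.0 * np.pi * x))


def spsa_gradient(f, x, eps=1e-2, rng=None):
    """One SPSA gradient estimate from two evaluations of f.

    All coordinates are perturbed simultaneously by a random +/-1 vector,
    so the cost is two function calls regardless of the dimension of x.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    delta = rng.choice([-1.0, 1.0], size=x.shape)
    diff = f(x + eps * delta) - f(x - eps * delta)
    # For +/-1 perturbations, dividing by delta equals multiplying by delta.
    return diff / (2.0 * eps) * delta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.uniform(-5.12, 5.12, size=10)  # D = 10, as in the Figure 5 caption
    print("f(x0) =", rastrigin(x0))
    print("SPSA gradient estimate:", spsa_gradient(rastrigin, x0, rng=rng))
```

Because SPSA only needs function evaluations, the same estimator could in principle be pointed at a measured hardware loss instead of `rastrigin`. A single estimate is noisy and is normally averaged or used with a small learning rate.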