Training Large-Scale Optical Neural Networks with Two-Pass Forward Propagation

Amirreza Ahmadnejad, Somayyeh Koohi

TL;DR

The paper tackles training efficiency, nonlinear activation implementation, and real-size data handling in optical neural networks by introducing Two-Pass Forward Propagation, which modulates the error and re-enters it into the forward path so that weights are updated without a separate backward pass. It also proposes an optical CNN approach that realizes convolutional processing with simple neural networks on integrated hardware, enabling real-size image processing. In FDTD-based simulations of an XOR gate and an MNIST-classifying optical CNN, the method demonstrates competitive accuracy and improved training dynamics across integrated and free-space platforms. Together, these contributions advance scalable, energy-efficient optical neuromorphic computing, with potential impact on large-scale data processing tasks.

Abstract

This paper addresses the limitations in Optical Neural Networks (ONNs) related to training efficiency, nonlinear function implementation, and large input data processing. We introduce Two-Pass Forward Propagation, a novel training method that avoids specific nonlinear activation functions by modulating and re-entering error with random noise. Additionally, we propose a new way to implement convolutional neural networks using simple neural networks in integrated optical systems. Theoretical foundations and numerical results demonstrate significant improvements in training speed, energy efficiency, and scalability, advancing the potential of optical computing for complex data tasks.
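
The idea of modulating the error and re-entering it into the forward path can be illustrated with a minimal NumPy sketch. The text above does not state the update rule, so this sketch assumes a PEPITA-style scheme: a first forward pass on the input `x`, a second forward pass on `x + F @ e` (where `e` is the output error and `F` is a fixed random matrix), and weight updates built only from quantities seen in the two passes. The layer sizes, learning rate, ReLU hidden layer, and all function and variable names are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def two_pass_step(W1, W2, F, x, target, lr=0.01):
    """One hypothetical two-pass update for a single sample.

    Assumption: a PEPITA-style rule; the source does not give the exact rule.
    """
    # Pass 1: ordinary forward pass on the clean input.
    h = relu(W1 @ x)
    y = W2 @ h
    e = y - target                       # output error

    # Pass 2: re-enter the error at the input through the fixed matrix F.
    x_mod = x + F @ e
    h_mod = relu(W1 @ x_mod)

    # Updates use only activations from the two forward passes;
    # no gradient is propagated backward through the layers.
    W1_new = W1 - lr * np.outer(h - h_mod, x_mod)
    W2_new = W2 - lr * np.outer(e, h_mod)
    return W1_new, W2_new, e

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 784, 32, 10      # hypothetical sizes (784 inputs, 10 outputs)
W1 = rng.normal(0.0, 0.05, (n_hidden, n_in))
W2 = rng.normal(0.0, 0.05, (n_out, n_hidden))
F = rng.normal(0.0, 0.01, (n_in, n_out))  # fixed random matrix, never trained

x = rng.random(n_in)                      # stand-in for a flattened image
target = np.zeros(n_out)
target[3] = 1.0                           # one-hot label, chosen arbitrarily

W1_new, W2_new, e = two_pass_step(W1, W2, F, x, target)
```

Under this assumed rule, a sample whose output already matches the target produces zero error, an unmodulated second pass, and therefore no weight change.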

Paper Structure

This paper contains 9 sections, 11 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Comparison between the concepts of (a) Backpropagation/Adjoint and (b) Two-Pass Forward Propagation training methods.
  • Figure 2: Outline of the optical neural network implementation using an MZI mesh. Each modulator is driven electro-optically with a specific voltage. The nonlinear block for the activation function is also implemented using the MZI mesh; the color difference between the two reflects the different material used as the phase shifter in the modulator.
  • Figure 3: Scheme of converting convolutional neural network to simple neural networks.
  • Figure 4: Equivalence of a convolutional network with a number of simple neural networks.
  • Figure 5: Visualization of the random matrix $\mathbf{F}^T$ for 10 outputs and 784 inputs.
  • ...and 1 more figure
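
Figures 3 and 4 concern the equivalence between a convolutional network and a number of simple neural networks. The captions do not spell out the construction, so the sketch below shows only one common reading, stated here as an assumption: a single-channel, stride-1, unpadded convolution (cross-correlation) gives the same result as applying one small fully connected unit, with shared weights, to every image patch. The function names and the 28×28 image and 3×3 kernel sizes are hypothetical.

```python
import numpy as np

def conv_as_patch_networks(image, kernel):
    """Apply one tiny dense unit (kh*kw inputs, 1 output) to every patch."""
    kh, kw = kernel.shape
    rows = image.shape[0] - kh + 1
    cols = image.shape[1] - kw + 1
    weights = kernel.reshape(-1)          # shared weights of the small network
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            patch = image[i:i + kh, j:j + kw].reshape(-1)
            out[i, j] = weights @ patch   # one small network evaluation
    return out

def direct_correlation(image, kernel):
    """Reference: sliding-window cross-correlation computed in one shot."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

rng = np.random.default_rng(1)
image = rng.random((28, 28))              # stand-in for an MNIST-sized image
kernel = rng.normal(size=(3, 3))

patchwise = conv_as_patch_networks(image, kernel)
reference = direct_correlation(image, kernel)
```

Both functions compute the same feature map; the patch-wise form makes explicit that each output pixel needs only a small, fixed-size network, which is the property that matters when the available hardware cannot take a full image as input at once.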