Quantum-limited stochastic optical neural networks operating at a few quanta per activation
Shi-Yuan Ma, Tianyu Wang, Jérémie Laydevant, Logan G. Wright, Peter L. McMahon
TL;DR
The paper shows that optical neural networks can operate in a regime where neurons in all layers except the last are activated by only a few photons, so that shot noise dominates the activations rather than acting as a small perturbation. By training with a physics-based stochastic model of the neuron activations (physics-aware stochastic training), the authors achieve accurate MNIST classification with single-photon activations and ultra-low optical energy per multiply-accumulate (MAC) operation. Experimentally, a two-layer single-photon-detection neural network (SPDNN) reaches up to 98% MNIST test accuracy at an optical energy on the order of 0.013–0.038 photons per MAC. The study also outlines coherent and deeper SPDNNs and argues that physics-aware training can unlock substantial benefits in ultra-low-power hardware, including non-optical implementations.
Abstract
Energy efficiency in computation is ultimately limited by noise, with quantum limits setting the fundamental noise floor. Analog physical neural networks hold promise for improved energy efficiency compared to digital electronic neural networks. However, they are typically operated in a relatively high-power regime so that the signal-to-noise ratio (SNR) is large, and the noise can be treated as a perturbation. We study optical neural networks where all layers except the last are operated in the limit that each neuron can be activated by just a single photon, and as a result the noise on neuron activations is no longer merely perturbative. We show that by using a physics-based probabilistic model of the neuron activations in training, it is possible to perform accurate machine-learning inference in spite of the extremely high shot noise (SNR ~ 1). We experimentally demonstrated MNIST handwritten-digit classification with a test accuracy of 98% using an optical neural network with a hidden layer operating in the single-photon regime; the optical energy used to perform the classification corresponds to just 0.038 photons per multiply-accumulate (MAC) operation. Our physics-aware stochastic training approach might also prove useful with non-optical ultra-low-power hardware.
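To make the "SNR ~ 1" regime concrete, the following is a minimal sketch of a single stochastic neuron. It is not the authors' implementation: it assumes Poisson photon statistics and an ideal click/no-click detector, so that a neuron receiving a mean of λ photons fires with probability 1 − exp(−λ). The function names (`click_probability`, `sample_activation`, `activation_snr`) are hypothetical and chosen only for illustration.

```python
import math
import random


def click_probability(mean_photons: float) -> float:
    """Probability of detecting at least one photon.

    Assumption: photon counts are Poisson-distributed with the given mean,
    and the detector is ideal (no dark counts, unit efficiency).
    """
    return 1.0 - math.exp(-mean_photons)


def sample_activation(mean_photons: float, rng: random.Random) -> int:
    """Draw one binary activation: 1 if the detector clicks, else 0."""
    return 1 if rng.random() < click_probability(mean_photons) else 0


def activation_snr(mean_photons: float) -> float:
    """Mean over standard deviation of the binary (Bernoulli) activation.

    For click probability p this equals sqrt(p / (1 - p)).
    """
    p = click_probability(mean_photons)
    if p >= 1.0:
        return math.inf  # exp(-mean) underflowed: the detector always clicks
    return math.sqrt(p / (1.0 - p))


if __name__ == "__main__":
    rng = random.Random(0)
    lam = math.log(2)  # mean photon number at which p = 0.5 and SNR = 1
    samples = [sample_activation(lam, rng) for _ in range(20_000)]
    print("click probability:", click_probability(lam))
    print("empirical firing rate:", sum(samples) / len(samples))
    print("activation SNR:", activation_snr(lam))
```

Under these assumptions, a mean of ln 2 ≈ 0.69 photons per activation gives a click probability of one half and an SNR of one, so a single forward pass is far from deterministic. A training procedure that uses the click probability as the model of the neuron, rather than a noiseless activation function, is one way to read "physics-based probabilistic model" in the abstract.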
