Scalable optical neural network with nonlocally coupled coherent photonic processor

Chun Ren, Ryota Tanomura, Kazuki Ichinose, Keigo Mizukami, Yoshitaka Taguchi, Taichiro Fukui, Yoshiaki Nakano, Takuo Tanemura

TL;DR

This work implements a scalable ONN that exploits the intrinsically diffractive and nonlocal nature of coherent light inside a silicon photonic chip, achieving a tenfold reduction in active components and establishing a practical pathway toward large-scale, energy-efficient, and reconfigurable photonic neural networks.

Abstract

Optical neural networks (ONNs) based on programmable photonic integrated circuits (PICs) offer a promising route toward low-latency and energy-efficient deep learning. However, conventional photonic implementations of matrix-vector multiplication (MVM) rely on locally connected architectures, such as Mach-Zehnder interferometer (MZI) meshes, whose number of active components scales quadratically with matrix size, severely limiting scalability. Here, we present a scalable ONN that overcomes this limitation by exploiting the intrinsically diffractive and nonlocal nature of coherent light inside a silicon photonic chip. Our approach employs cascaded stages of multiport directional couplers (MDCs) interleaved with compact phase-shifter arrays, enabling strong nonlocal coupling among multiple optical modes. We show that an MDC-based optical unitary converter (OUC) requires only $3N$ phase shifters to achieve uniform coverage over the $N$-dimensional complex unitary group, in stark contrast to the $O(N^2)$ scaling of conventional MZI meshes. Based on the singular value decomposition, we demonstrate that an $N\times N$ MVM can be realized using only $7N$ phase shifters, breaking the traditional $O(N^2)$ scaling barrier. We experimentally implement a 32-input silicon photonic MVM chip with a tenfold reduction in active components and validate its performance on various classification tasks. Our results establish a practical pathway toward large-scale, energy-efficient, and reconfigurable photonic neural networks.
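
As a rough illustration of the SVD-based scheme described in the abstract, the following NumPy sketch applies a weight matrix as three successive stages ($V^\dagger$, then $\Sigma$, then $U$) and tallies phase shifters for both architectures. The split $7N = 3N + N + 3N$ (two OUCs plus one phase shifter per singular value) is an assumption that is consistent with the abstract's totals but is not stated there. The MZI baseline of $N(N-1)$ phase shifters per unitary is likewise an assumption: it corresponds to a standard universal mesh with $N(N-1)/2$ MZIs and two phase shifters each. All function names are hypothetical.

```python
import numpy as np


def svd_mvm(weight, x):
    """Compute weight @ x as three successive stages: V^dagger, Sigma, U."""
    u, s, vh = np.linalg.svd(weight)   # weight = u @ diag(s) @ vh
    after_v = vh @ x                   # first unitary stage (V^dagger)
    after_sigma = s * after_v          # diagonal singular-value stage (Sigma)
    return u @ after_sigma             # second unitary stage (U)


def phase_shifter_counts(n):
    """Return (MDC-based count, MZI-mesh count) for an n x n MVM.

    Assumptions (not taken from the paper):
      * MDC-based: 3n per unitary + n for Sigma + 3n per unitary = 7n
      * MZI mesh:  n(n-1) per unitary + n for Sigma + n(n-1) per unitary
    """
    mdc_based = 3 * n + n + 3 * n
    mzi_mesh = n * (n - 1) + n + n * (n - 1)
    return mdc_based, mzi_mesh


rng = np.random.default_rng(0)
n = 32
weight = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
x = rng.normal(size=n) + 1j * rng.normal(size=n)

# The three-stage product reproduces the direct matrix-vector product.
assert np.allclose(svd_mvm(weight, x), weight @ x)

print(phase_shifter_counts(n))  # (224, 2016)
```

Under these assumed counts, the ratio at $N = 32$ is about 9, which is of the same order as the tenfold reduction quoted in the abstract; the exact baseline used by the authors may differ.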

Paper Structure

This paper contains 17 sections, 14 equations, 18 figures, 3 tables.

Figures (18)

  • Figure 1: ONN using 32-input fully integrated photonic chip. a Schematic of the 32-input ONN. Each layer has a weight matrix of size $32\times32$. b Schematic of the photonic chip, where MVM is conducted entirely in the optical domain. The input vector generated by 32 MZMs is successively multiplied by a $32\times32$ unitary matrix $V^\dagger$, a diagonal singular value matrix $\Sigma$, and another $32\times32$ unitary matrix $U$, and finally detected by 32 PDs. The unitary matrices are realized by two 32-input 3-stage MDC-OUCs.
  • Figure 2: Numerical comparison of MDC-OUC and MZI-OUC. a Schematic of $N$-input $M$-stage MZI-OUC. b Schematic of $N$-input $M$-stage MDC-OUC. c $32\times32$ unitary matrices generated by MZI-OUC ($N = 32$) with various numbers of stages $M$. The absolute values (upper panel) and arguments (lower panel) of all components of an example matrix $U$ are plotted. d $32\times32$ unitary matrices generated by MDC-OUC ($N = 32$) with various numbers of stages $M$. e Haar randomness of $32\times32$ unitary matrices generated by MZI-OUC and MDC-OUC ($N = 32$) with various $M$. f MNIST image classification accuracy obtained by two-layer 32-input ONN for both cases ($N = 32$). g Haar randomness of $128\times128$ unitary matrices generated by MZI-OUC and MDC-OUC ($N = 128$). h MNIST image classification accuracy obtained by two-layer 128-input ONN for both cases ($N = 128$).
  • Figure 3: Fabricated silicon photonic 32-input MVM PIC. a Microscope image of the entire PIC. b Array of 32 MZMs. c Array of 32 PDs. d 32-port directional coupler inside an MDC-OUC. e SEM image at the input of a 32-port directional coupler, marked by the white broken-line box in d. f Packaged ONN PIC module for laboratory testing.
  • Figure 4: Experimental results. a Characterization of phase shifters: output power of test MZM against driving voltage (left panel) and the histograms of the electrical resistance (right top panel) and the phase shifting coefficient (right bottom panel) measured for 226 phase shifters. In the histograms, $\mu$ and $\sigma$ denote the mean value and standard deviation, respectively. b Characterization of a test PD (left panel) and the histogram of the responsivity measured for all 32 PDs (right panel). c Experimental results of iris flower classification. Test accuracy reaches 100% on the 30 test samples taken from the 150 samples in the whole dataset. d Experimental results of wine classification. Test accuracy reaches 91.7% on the 36 test samples taken from 178 samples in total. e Experimental results for binary classification of handwritten digits 0 and 1 from the MNIST dataset. A subset of input images, the training and validation accuracy curves, and the final classification output are shown. A test accuracy of 97.7% is achieved on 300 testing samples. f Experimental results for binary classification of handwritten digits 0 and 6 from the MNIST dataset. A test accuracy of 90.3% is achieved on 300 testing samples.
  • Figure 5: Benchmark comparison of this work against previously reported reconfigurable MVM PICs. See Table \ref{tab1:benchmark} for exact values.
  • ...and 13 more figures
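
The $N$-input $M$-stage MDC-OUC named in the abstract and in the captions of Figures 1 and 2 (cascaded multiport directional couplers interleaved with phase-shifter arrays) can be sketched numerically as follows. This is a minimal model under stated assumptions, not the authors' device model: the MDC is represented by a textbook coupled-mode picture of $N$ identical waveguides with uniform nearest-neighbour coupling, the normalised coupling length of 16 is an arbitrary illustrative value, and the ordering (phase shifters first, then the coupler, in each stage) is assumed. The function names are hypothetical.

```python
import numpy as np


def mdc_transfer_matrix(n, coupling_length=16.0):
    """Hypothetical MDC model: exp(-i * coupling_length * A).

    A is the nearest-neighbour adjacency matrix of n identical waveguides
    (an assumed coupled-mode model, not taken from the paper).
    """
    adjacency = np.eye(n, k=1) + np.eye(n, k=-1)
    eigvals, eigvecs = np.linalg.eigh(adjacency)
    # Matrix exponential via eigendecomposition of the symmetric matrix A.
    return (eigvecs * np.exp(-1j * coupling_length * eigvals)) @ eigvecs.conj().T


def mdc_ouc(phases, coupling_length=16.0):
    """Build the unitary of an M-stage MDC-OUC from an (M, N) phase array."""
    _, n = phases.shape
    mdc = mdc_transfer_matrix(n, coupling_length)
    u = np.eye(n, dtype=complex)
    for stage_phases in phases:
        # Each stage: a phase-shifter array (diagonal) followed by the fixed MDC.
        u = mdc @ (np.exp(1j * stage_phases)[:, None] * u)
    return u


rng = np.random.default_rng(1)
n, m_stages = 32, 3                      # 32-input, 3-stage, as in Figure 1
phases = rng.uniform(0.0, 2.0 * np.pi, size=(m_stages, n))
u = mdc_ouc(phases)

# The cascade is unitary for any phase setting, using M * N phase shifters.
assert np.allclose(u.conj().T @ u, np.eye(n))
print(phases.size)  # 96
```

The sketch only confirms that the cascade is unitary with $3N$ tunable phases at $M = 3$; it does not reproduce the Haar-randomness or classification-accuracy analysis summarised in Figure 2.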