Metasurface-based all-optical diffractive convolutional neural networks

Zhijiang Liang, Chenxuan Xiang, Shuyuan Xiao, Jumin Qiu, Jie Li, Qiegen Liu, Chengjun Zou, Tingting Liu

TL;DR

The paper addresses the energy and parallelism bottlenecks of electronic CNNs by proposing an all-optical neural network architecture, MAODCNN. It integrates a phase-only metasurface convolutional layer with cascaded diffractive metasurfaces to perform end-to-end optical feature extraction and inference on images. Training uses backpropagation in the optical domain on MNIST and Fashion-MNIST. Adding diffractive layers yields up to a $20\%$ accuracy improvement over a single layer, and MAODCNN outperforms a conventional DNN by up to $15\%$ on MNIST. The results indicate a practical, low-power, all-optical platform and point to future directions such as joint phase-amplitude modulation and chip-scale optoelectronic integration for edge-enabled optical AI hardware.

Abstract

The escalating energy demands and parallel-processing bottlenecks of electronic neural networks underscore the need for alternative computing paradigms. Optical neural networks, capitalizing on the inherent parallelism and speed of light propagation, present a compelling solution. Nevertheless, physically realizing convolutional neural network (CNN) components all-optically remains a significant challenge. To this end, we propose a metasurface-based all-optical diffractive convolutional neural network (MAODCNN) for computer vision tasks. This architecture synergistically integrates metasurface-based optical convolutional layers, which perform parallel convolution on the optical field, with cascaded diffractive neural networks acting as all-optical decoders. This co-design facilitates layer-wise feature extraction and optimization directly within the optical domain. Numerical simulations confirm that the fusion of convolutional and diffractive layers markedly enhances classification accuracy, a performance that scales with the number of diffractive layers. The MAODCNN framework establishes a viable foundation for practical all-optical CNNs, paving the way for high-efficiency, low-power optical computing in advanced pattern recognition.
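
To make the architecture concrete, the following is a minimal numerical sketch of a forward pass through cascaded phase-only layers. It is not the authors' implementation. It assumes scalar diffraction modeled with the angular spectrum method, and it treats the convolutional layer simply as a phase mask built by tiling one kernel across the aperture. The function names (`angular_spectrum_propagate`, `tiled_kernel_mask`, `forward`), the 20 µm layer spacing, and the use of the unit cell period as the simulation pixel pitch are all illustrative assumptions; only the 532 nm wavelength and the 400 nm period are taken from the Figure 2 caption.

```python
import numpy as np


def angular_spectrum_propagate(field, wavelength, pixel_pitch, distance):
    """Propagate a square complex field over `distance` in free space."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=pixel_pitch)            # spatial frequencies (1/m)
    fxx, fyy = np.meshgrid(fx, fx, indexing="ij")
    # Longitudinal wavenumber; a negative argument gives an imaginary kz,
    # so evanescent components decay instead of propagating.
    arg = (1.0 / wavelength) ** 2 - fxx ** 2 - fyy ** 2
    kz = 2.0 * np.pi * np.sqrt(arg.astype(complex))
    transfer = np.exp(1j * kz * distance)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)


def tiled_kernel_mask(kernel_phase, reps):
    """Hypothetical convolution-layer mask: one kernel phase tiled reps x reps."""
    return np.tile(kernel_phase, (reps, reps))


def forward(input_amplitude, phase_masks,
            wavelength=532e-9,    # from the Figure 2 caption
            pixel_pitch=400e-9,   # assumed equal to the unit cell period P
            spacing=20e-6):       # assumed layer-to-layer distance
    """Return the detector-plane intensity after all phase-only layers."""
    field = input_amplitude.astype(complex)
    for phase in phase_masks:
        field = angular_spectrum_propagate(field, wavelength, pixel_pitch, spacing)
        field = field * np.exp(1j * phase)           # phase-only modulation
    field = angular_spectrum_propagate(field, wavelength, pixel_pitch, spacing)
    return np.abs(field) ** 2
```

In this sketch the first entry of `phase_masks` could be `tiled_kernel_mask(kernel, reps)` and the remaining entries the diffractive layers. Because every layer only changes phase and all spatial frequencies propagate at this pitch and wavelength, the total output intensity equals the total input intensity, which is a convenient sanity check.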

Paper Structure

This paper contains 4 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: Architecture of the metasurface-based all-optical diffractive convolutional neural network (MAODCNN). (a) Schematic of the full process of all-optical classification, from optical input to the final decision at the detector array. (b) Design of the metasurface convolutional layer, which performs parallel convolution by tiling a single kernel design (comprising $n$ sub-segments) across the aperture. (c) Configuration of the cascaded diffractive metasurface layers and the output detector plane.
  • Figure 2: Metasurface unit cell design and optical characterization. (a) Schematic of a TiO$_{2}$ metasurface unit cell, showing the fixed period ($P$) and height ($H$), and tunable lateral dimensions ($D_{x}$, $D_{y}$) that define its optical response. (b), (c) FDTD-simulated transmission amplitude (b) and phase shift (c) as functions of $D_{x}$ and $D_{y}$ at a wavelength of 532 nm ($P=400$ nm, $H=600$ nm).
  • Figure 3: Performance evaluation of MAODCNN on MNIST and Fashion-MNIST datasets. (a) Classification accuracy on MNIST as a function of the number of hidden layers and output classes. (b) Comparative classification accuracy between a conventional DNN and the proposed MAODCNN on the 10-class MNIST task. (c) Confusion matrix of the MAODCNN's predictions on the Fashion-MNIST test set. (d) Classification accuracy on MNIST versus the occupancy rate of a single detection region for DNN and MAODCNN.
  • Figure 4: Impact of amplitude crosstalk on the MAODCNN performance. (a) Output light field distribution of the purely phase-modulated MAODCNN. (b) Output light field distribution of the MAODCNN incorporating amplitude crosstalk. (c) Recognition error between the phase-only network and the network with amplitude crosstalk across the test set. (d) Normalized energy distribution for each digit class ("0"-"9") in the MAODCNN with amplitude crosstalk, demonstrating minimal deviation (error bars 3%) from the phase-only network.
  • Figure 5: Representative output patterns and class activation. Simulated output optical patterns and the corresponding normalized energy distribution across the 10 detector regions for (a) an input handwritten digit "3" (MNIST) and (b) an input "shoe" (Fashion-MNIST). The white squares highlight the activated detector region corresponding to the correct class.
  • ...and 2 more figures
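
The detector-plane readout described in the captions for Figures 1, 4 and 5 (normalized energy across detector regions, with the brightest region giving the predicted class) can be sketched as follows. This is an illustrative assumption about the readout, not the authors' code: the function names and the representation of each region as a `(row_start, row_end, col_start, col_end)` rectangle are hypothetical.

```python
import numpy as np


def detector_energies(intensity, regions):
    """Return the normalized energy collected by each detector region.

    `regions` is a list of (row_start, row_end, col_start, col_end) tuples,
    one rectangle per class (a hypothetical layout for illustration).
    """
    energies = np.array(
        [intensity[r0:r1, c0:c1].sum() for (r0, r1, c0, c1) in regions],
        dtype=float,
    )
    total = energies.sum()
    if total == 0.0:
        return energies          # no light reached any detector region
    return energies / total


def classify(intensity, regions):
    """Predict the class whose detector region collects the most energy."""
    return int(np.argmax(detector_energies(intensity, regions)))
```

For a 10-class task, `regions` would hold ten non-overlapping rectangles on the output plane, and `classify` returns the index of the activated region.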