Metasurface-based all-optical diffractive convolutional neural networks
Zhijiang Liang, Chenxuan Xiang, Shuyuan Xiao, Jumin Qiu, Jie Li, Qiegen Liu, Chengjun Zou, Tingting Liu
TL;DR
The paper tackles the energy and parallelism bottlenecks of electronic CNNs by proposing a metasurface-based all-optical diffractive convolutional neural network (MAODCNN). MAODCNN integrates a phase-only metasurface convolutional layer with cascaded diffractive metasurfaces to perform end-to-end optical feature extraction and inference on images. Training uses backpropagation in the optical domain on MNIST and Fashion-MNIST; adding diffractive layers yields up to a $20\%$ accuracy improvement over a single layer, and the network outperforms a conventional DNN by up to $15\%$ on MNIST. The numerical results indicate a practical, low-power, all-optical platform and point to future directions such as joint phase-amplitude modulation and chip-scale optoelectronic integration for edge-enabled optical AI hardware.
Abstract
The escalating energy demands and parallel-processing bottlenecks of electronic neural networks underscore the need for alternative computing paradigms. Optical neural networks, capitalizing on the inherent parallelism and speed of light propagation, present a compelling solution. Nevertheless, physically realizing convolutional neural network (CNN) components all-optically remains a significant challenge. To this end, we propose a metasurface-based all-optical diffractive convolutional neural network (MAODCNN) for computer vision tasks. This architecture synergistically integrates metasurface-based optical convolutional layers, which perform parallel convolution on the optical field, with cascaded diffractive neural networks acting as all-optical decoders. This co-design facilitates layer-wise feature extraction and optimization directly within the optical domain. Numerical simulations confirm that the fusion of convolutional and diffractive layers markedly enhances classification accuracy, a performance that scales with the number of diffractive layers. The MAODCNN framework establishes a viable foundation for practical all-optical CNNs, paving the way for high-efficiency, low-power optical computing in advanced pattern recognition.
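As an illustration of the cascaded diffractive decoder described above, the following minimal sketch simulates a forward pass through a stack of phase-only layers. It is not the authors' implementation: the angular spectrum propagator, the function names, the wavelength, pixel pitch, layer spacing, random (untrained) phase masks, and the strip-shaped detector readout are all assumptions chosen for illustration, and the metasurface convolutional layer is omitted.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, pixel_pitch, distance):
    """Propagate a square complex field through free space over `distance`."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=pixel_pitch)            # spatial frequencies
    fxx, fyy = np.meshgrid(fx, fx, indexing="ij")
    # Squared longitudinal frequency; negative values are evanescent waves.
    arg = (1.0 / wavelength) ** 2 - fxx ** 2 - fyy ** 2
    kz = 2.0 * np.pi * np.sqrt(arg.astype(complex))
    transfer = np.exp(1j * kz * distance)            # free-space transfer function
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def forward(image, phase_masks, wavelength, pixel_pitch, distance):
    """Encode the image as field amplitude, pass it through phase-only layers,
    and return the intensity at the detector plane."""
    field = image.astype(complex)
    for phase in phase_masks:
        field = angular_spectrum_propagate(field, wavelength, pixel_pitch, distance)
        field = field * np.exp(1j * phase)           # phase-only modulation
    field = angular_spectrum_propagate(field, wavelength, pixel_pitch, distance)
    return np.abs(field) ** 2

def class_scores(intensity, n_classes):
    """Hypothetical readout: sum the intensity over vertical detector strips."""
    strips = np.array_split(intensity, n_classes, axis=1)
    return np.array([s.sum() for s in strips])

# Assumed parameters (illustrative only, not taken from the paper).
wavelength = 632.8e-9   # metres
pixel_pitch = 1.0e-6    # metres
distance = 50.0e-6      # metres between successive planes

rng = np.random.default_rng(0)
image = rng.random((32, 32))                          # stand-in for an input image
phase_masks = [rng.uniform(0.0, 2.0 * np.pi, (32, 32)) for _ in range(3)]

intensity = forward(image, phase_masks, wavelength, pixel_pitch, distance)
scores = class_scores(intensity, n_classes=10)
predicted_class = int(np.argmax(scores))
```

In a trained network the phase masks would be optimized by backpropagation through this differentiable forward model so that the optical energy concentrates on the detector region of the correct class; the random masks here only demonstrate the data flow. Appending masks to `phase_masks` corresponds to adding diffractive layers.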
