Nonlinear optical encoding enabled by recurrent linear scattering

Fei Xia; Kyungduk Kim; Yaniv Eliezer; SeungYun Han; Liam Shaughnessy; Sylvain Gigan; Hui Cao

TL;DR

This work addresses the difficulty of achieving optical nonlinearity by using a passive, tunable nonlinear mapping inside a reconfigurable multiple-scattering cavity. A DMD-controlled scattering potential in an integrating sphere creates high-order nonlinear features without relying on active nonlinear optical materials or high-power pumping; a lightweight digital decoder then extracts information from a small set of optical modes. The approach performs well across four tasks (FashionMNIST classification, image reconstruction, keypoint detection, and real-time pedestrian detection) at extreme optical compression (up to $3072:1$), and mutual information analyses show high information content per mode. It points to a scalable, energy-efficient pathway for optical computing and fast data analytics, with potential implications for sensing, imaging, and autonomous systems.

Abstract

Optical information processing and computing can potentially offer enhanced performance, scalability and energy efficiency. However, achieving nonlinearity, a critical component of computation, remains challenging in the optical domain. Here we introduce a design that leverages a multiple-scattering cavity to passively induce optical nonlinear random mapping with a continuous-wave laser at low power. Each scattering event effectively mixes information from different areas of a spatial light modulator, resulting in a highly nonlinear mapping between the input data and output pattern. We demonstrate that our design retains vital information even when the readout dimensionality is reduced, thereby enabling optical data compression. This capability allows our optical platforms to offer efficient optical information processing solutions across applications. We demonstrate our design's efficacy across tasks, including classification, image reconstruction, keypoint detection and object detection, all of which are achieved through optical data compression combined with a digital decoder. In particular, high performance at extreme compression ratios is observed in real-time pedestrian detection. Our findings open pathways for novel algorithms and unconventional architectural designs for optical computing.
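The idea that repeated linear scattering produces a nonlinear dependence on the modulator pattern can be illustrated with a toy model. The sketch below is not the authors' model. It assumes a fixed random round-trip matrix `s` with loss, a diagonal mask `x` standing in for the DMD, and a steady-state field given by the geometric series over bounces. All names (`make_cavity`, `output_field`, `speckle_intensity`) are hypothetical.

```python
import numpy as np


def make_cavity(n_modes, n_out, loss=0.5, seed=0):
    """Random toy cavity (assumption): round-trip matrix s, input a, output coupling b."""
    rng = np.random.default_rng(seed)
    s = rng.normal(size=(n_modes, n_modes)) + 1j * rng.normal(size=(n_modes, n_modes))
    s *= loss / np.linalg.norm(s, 2)  # largest singular value = loss < 1, so the series converges
    a = rng.normal(size=n_modes) + 1j * rng.normal(size=n_modes)
    b = rng.normal(size=(n_out, n_modes)) + 1j * rng.normal(size=(n_out, n_modes))
    return s, a, b


def output_field(x, s, a, b, max_bounces=None):
    """Output field for mask x with entries in [0, 1].

    Every step is linear in the field, but the mask x multiplies the field
    once per bounce, so bounce k contributes terms of order k in x.
    """
    m = s @ np.diag(np.asarray(x, dtype=complex))
    if max_bounces is None:
        # Sum over all bounces: (I - s diag(x))^{-1} a
        field = np.linalg.solve(np.eye(len(a)) - m, a)
    else:
        field = np.zeros_like(a)
        term = a.copy()
        for _ in range(max_bounces + 1):
            field = field + term
            term = m @ term
    return b @ field


def speckle_intensity(x, s, a, b):
    """Camera-like intensity readout of the output modes."""
    return np.abs(output_field(x, s, a, b)) ** 2
```

In this toy model, truncating at a single bounce leaves a field that is affine in `x`, whereas the full sum contains products of mask values from different pixels, which is the kind of mixed term the abstract attributes to multiple scattering.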

Paper Structure (6 sections, 2 equations, 4 figures)

This paper contains 6 sections, 2 equations, 4 figures.

Nonlinear random mapping with tunable nonlinearity
Enhanced image classification
Demonstration with complex tasks
Image reconstruction
Keypoint detection
Real-time video analytics

Figures (4)

Figure 1: Concept of using a multiple-scattering cavity as a passive, tunable nonlinear optical information processor. (a) The experimental setup, in which the key component for creating the passive nonlinear random mapping is a DMD mounted on an integrating sphere. The cavity output is a fully developed speckle pattern whose response is nonlinear in the geometric configuration of the DMD. (b) Representative illustration of how the cavity encodes the input pattern on the DMD: multiple bounces optically mix different areas of the input to create a highly nonlinear feature, a speckle pattern recorded by a camera. (The input pattern is adapted from the MNIST dataset lecun1998mnist.) (c) The mathematical representation of the nonlinear mapping that transforms a set of input elements on the DMD into a collection of nonlinear features in the output speckle pattern. Multiple scattering in the cavity generates mixed terms of the input values at different pixels with various high nonlinear orders, providing rich nonlinear features that can be optimally trained to enhance performance in complex computational tasks. $f(x)$ denotes the operation of scaling the configuration of a DMD macropixel $x_{i,j}$.
Figure 2: Classification with nonlinear mapping. (a) Training data from the FashionMNIST dataset are used to train a 1-layer neural network as a digital decoder for classification tasks. Additionally, the percentage of the modulated area on the DMD is changed among 6.25$\%$, 25$\%$ and 100$\%$ to adjust the order of nonlinear mapping. With full (100$\%$) modulation of the DMD, the nonlinear order is further enhanced by covering the output port with a partial reflector (silicon wafer). (b) FashionMNIST classification results with a linear classifier under different numbers of output modes (speckle grains) and varying nonlinear strength. The "optical linear features with quadratic detection" are simulated by scattering from a single layer with intensity detection to create a quadratic nonlinear response. Note that a linear regression for binarized FashionMNIST data cannot exceed 77.6$\%$ with the same number of modes. (c,d) Violin plots representing the distributions of mutual information between the speckle grains and classification targets under varying numbers of output modes in (c), and differing order of nonlinear mapping by changing the modulated area on the DMD or partially closing the cavity (enhanced) in (d). For $n$ speckle modes ($n$ on the $x$-axis), $4n$ replicated measurements from the same input were performed in (c) and (d). The dashed lines depict the median values of the mutual information. Each violin's width reflects the distribution of the mutual information values of the speckle grains and its probability density. Within each violin, the slim black vertical line represents the range from minimal to maximal values, the black box represents the first to third quartiles, and the white dot represents the median. (c) Mutual information analysis when the number of output modes (speckle grains) varies under the highest-order nonlinear mapping. (d) Mutual information analysis with low-dimensional speckle features (4 output modes) for FashionMNIST as a function of the nonlinear orders varied by the modulated area on the DMD, showing the advantage of going to higher-order nonlinear mapping.
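The mutual information between one speckle grain and the class label, as plotted in Figure 2(c,d), can be estimated in several ways, and the caption does not say which estimator was used. The following is a minimal sketch of one common choice, a plug-in histogram estimator with equal-population bins. The function name `mutual_information_bits` and the bin count are assumptions for illustration only.

```python
import numpy as np


def mutual_information_bits(feature, labels, n_bins=16):
    """Plug-in estimate of I(feature; label) in bits.

    The continuous feature (e.g. one speckle-grain intensity) is discretized
    into equal-population bins; labels are treated as discrete classes.
    """
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels)
    # Interior quantiles serve as bin edges, so each bin holds roughly the same count.
    edges = np.quantile(feature, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    f_idx = np.searchsorted(edges, feature, side="right")
    _, y_idx = np.unique(labels, return_inverse=True)
    # Joint histogram over (feature bin, class), normalized to a probability table.
    joint = np.zeros((n_bins, y_idx.max() + 1))
    np.add.at(joint, (f_idx, y_idx), 1.0)
    joint /= joint.sum()
    p_f = joint.sum(axis=1, keepdims=True)
    p_y = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (p_f @ p_y)[nz])))
```

Applied to every grain in turn, an estimator of this kind yields one value per grain, which is the sort of per-grain distribution a violin plot summarizes. Plug-in estimates are biased upward for small sample sizes, so values from different sample counts should be compared with care.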
Figure 3: Computing performance enhanced by nonlinear optical data compression. (a) Concept of image reconstruction using linear optical complex media for linear encoding and camera detection with quadratic response. (b) Reconstruction from the speckle features in (a); orange boxes mark the incorrectly reconstructed pairs. (c) The multiple-scattering cavity as a nonlinear optical encoder, also with camera detection, employing compressed speckle features for digital reconstruction of the original image data. (d) Reconstruction from speckle features generated by the multiple-scattering cavity. In (b,d), approximately 25 speckle grains are used, giving a compression ratio of 31:1, to train two digital decoders (see details in Methods). Given the same number of compressed output modes (speckle grains), nonlinear features generated by the cavity reduce the mean squared error by 0.6, resulting in a better reconstruction of the images in (d) compared to (b). More results are included in Supplementary Figs. S4-6. (e) Concept of keypoint detection in human faces (images with 96 $\times$ 96 pixels) with compressed speckle features. (f) Keypoint detection with a mode compression ratio of 576:1, using 16 output modes with relatively weaker nonlinearity (25$\%$ modulated area on the DMD) and a 5-layer MLP decoder. (g) Improved keypoint detection with a reduced mean error across 15 keypoints (1.06 pixels compared to 1.86 pixels in (f)), using 16 output modes (speckle grains) with relatively stronger nonlinearity (fully modulated area on the DMD) and a 9-layer MLP decoder.
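The compression ratios quoted in this caption follow from dividing the number of input pixels by the number of output modes. The short check below reproduces the 576:1 figure from the stated 96 $\times$ 96 input and 16 modes. For the 31:1 figure it assumes 28 $\times$ 28 input images (the standard MNIST/FashionMNIST size), which the caption does not state. The helper name `compression_ratio` is hypothetical.

```python
def compression_ratio(input_pixels, output_modes):
    """Ratio of input dimensionality to the number of read-out speckle grains."""
    return input_pixels / output_modes


# Keypoint detection: 96 x 96 pixel faces read out through 16 modes.
keypoint_ratio = compression_ratio(96 * 96, 16)

# Image reconstruction: assumed 28 x 28 inputs read out through about 25 modes.
reconstruction_ratio = compression_ratio(28 * 28, 25)

print(f"keypoint detection: {keypoint_ratio:.0f}:1")
print(f"image reconstruction: {reconstruction_ratio:.2f}:1")
```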
Figure 4: Real-time video pedestrian detection in driving with a high mode compression ratio using only 25 output modes. (a) Schematic representation of real-time pedestrian detection using video data from a dash camera during driving. The multiple-scattering cavity functions as an optical data compressor, and the compressed nonlinear optical features are used for pedestrian detection with a digital decoder. (b) Demonstration of pedestrian detection at close to a real-time video rate. The magenta boxes represent the inference results from the speckle. The green boxes represent the ground truth. The speed of optical processing, that is, nonlinear feature generation, is as fast as light, and its readout speed is limited only by the camera. With only 25 modes, our camera can currently reach at least 800 Hz. The inference time with the 25 modes in pedestrian detection is 0.0035 s, leading to a total response time (inference + generation of optical features) of less than 0.1 s, which is faster than the typical human response time of 0.2$\sim$22 s. The error unit is pixels (px). (c) Demonstration of pedestrian detection at various locations during continuous video streaming; the mean detection error with only 25 modes remains within 1.92 pixels (px).
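The timing claim in panel (b) can be checked with simple arithmetic. The sketch below assumes that generating the optical features costs one camera frame period at the quoted 800 Hz and ignores DMD display time and data transfer, neither of which the caption quantifies. The variable names are illustrative.

```python
camera_rate_hz = 800.0   # quoted minimum readout rate with 25 modes
inference_s = 0.0035     # quoted decoder inference time

# Assumption: one frame period is the cost of reading out the optical features.
frame_period_s = 1.0 / camera_rate_hz
total_s = frame_period_s + inference_s

print(f"frame period: {frame_period_s * 1e3:.2f} ms")
print(f"readout + inference: {total_s * 1e3:.2f} ms")
```

Under these assumptions the two quoted contributions add up to a few milliseconds, well inside the 0.1 s bound given in the caption.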
