Tutorial: A practical guide to the alignment of defocused spatial light modulators for fast diffractive neural networks

Guillaume Noetinger, Tim Tuuva, Romain Fleury

TL;DR

This work demonstrates a scalable, defocused-conjugation approach to align multiple spatial light modulators (SLMs) for optical diffractive neural networks, enabling high-throughput, parallel processing of hundreds of inputs. The authors develop a semi-automatic alignment protocol based on edge-diffraction and ellipse-detection, achieving pixel-level conjugation across ROIs and providing a mapping suitable for batch training. They characterize alignment precision, model the PSF with Huygens-Fresnel theory, and examine the impact of a diaphragm on high-frequency noise and speckle grain, as well as transmission-matrix measurements for refocusing. The study shows substantial speedups in training time via spatial multiplexing and demonstrates noise reduction through averaging, while highlighting challenges such as parasitic reflections that hinder backpropagation and the need for further advances in robust optical training methods. The results offer a practical pathway to scalable optical DNNs with potential applications in wavefront control, imaging, and optical computing, while outlining concrete avenues for improving alignment, TM-based optimization, and training strategies.

Abstract

The conjugation of multiple spatial light modulators (SLMs) enables the construction of optical diffractive neural networks (DNNs). Training is limited by the low refresh rate of SLMs; to accelerate it, the input data can be spatially multiplexed across different spatial channels, making maximal use of the available spatial degrees of freedom (DoFs). Precise alignment is required to ensure that the same physical operation is performed across each channel. We present a semi-automatic procedure for this experimentally challenging alignment, resulting in a pixel-level conjugation. It is scalable to any number of SLMs and may be useful in wavefront shaping setups where precise conjugation of SLMs is required, e.g. for the control of optical waves in phase and amplitude. The resulting setup functions as an optical DNN able to process hundreds of inputs simultaneously, thereby reducing training times and experimental noise through spatial averaging. We further present a characterization of the setup and an alignment method.
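The spatial multiplexing described in the abstract can be illustrated with a minimal NumPy sketch. It only shows the bookkeeping of placing a batch of inputs on one frame as a grid of regions of interest (RoIs) and reading them back. The function names `tile_batch` and `split_frame`, the `pad` guard band, and the assumption of an identity pixel mapping between display and camera are illustrative and are not taken from the paper.

```python
import numpy as np

def tile_batch(images, grid=(10, 10), pad=0):
    """Place a batch of images, shape (N, h, w), on one frame as a grid of RoIs.

    `pad` is a hypothetical guard band (in pixels) around each RoI.
    """
    images = np.asarray(images)
    n, h, w = images.shape
    rows, cols = grid
    if n > rows * cols:
        raise ValueError("batch does not fit in the grid")
    cell_h, cell_w = h + 2 * pad, w + 2 * pad
    frame = np.zeros((rows * cell_h, cols * cell_w), dtype=images.dtype)
    for k in range(n):
        r, c = divmod(k, cols)  # row-major position of input k in the grid
        top, left = r * cell_h + pad, c * cell_w + pad
        frame[top:top + h, left:left + w] = images[k]
    return frame

def split_frame(frame, grid=(10, 10), pad=0):
    """Cut a frame back into its RoIs, assuming an identity pixel mapping."""
    rows, cols = grid
    cell_h, cell_w = frame.shape[0] // rows, frame.shape[1] // cols
    rois = [
        frame[r * cell_h + pad:(r + 1) * cell_h - pad,
              c * cell_w + pad:(c + 1) * cell_w - pad]
        for r in range(rows) for c in range(cols)
    ]
    return np.stack(rois)

# Usage: 100 MNIST-sized inputs displayed in a single frame.
batch = np.random.default_rng(0).random((100, 28, 28))
frame = tile_batch(batch, grid=(10, 10), pad=2)
recovered = split_frame(frame, grid=(10, 10), pad=2)
```

In an experiment the camera-side RoIs would come from the calibrated mapping produced by the alignment procedure rather than from a fixed grid; the sketch assumes the ideal case to keep the indexing visible.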

Paper Structure

This paper contains 20 sections, 3 equations, 12 figures.

Figures (12)

  • Figure 1: An optical diffractive neural network. (a) The wave associated with the image of a given digit is directed onto the corresponding detector by properly tuning the propagation media, forming a diffractive processor [Ozcan]. (b) A multi-plane light converter (MPLC) can perform an equivalent transformation of a light field, but at the expense of a complicated alignment, low yield, and the sacrifice of many degrees of freedom (DoFs). (c) The proposed setup consists of conjugated amplitude and phase modulators whose output is sent to a defocused camera at a distance $\Delta z$ from the image plane; the resulting intensity is then sent through the setup again for a further modulation. Given the typical machine learning image size compared to the SLM resolution, multiple inputs can be displayed at the same time (here $3\times3$ images, $10 \times 10$ in our experiments) to speed up the acquisition.
  • Figure 2: The setup. An amplitude modulator (the µDisplay) is conjugated to a phase modulator (the SLM) using a $4f$ system. The resulting field is sent to a defocused camera after filtering by a diaphragm $D$. The polarization is indicated by the green arrows. Only the horizontal polarization is modified by the SLM (green letter $M$). The remaining vertical polarization is filtered out with a polarizer. The field on the SLM is the convolution of the µDisplay's image with a point spread function (PSF) $PSF_1$; the propagation to the defocused camera corresponds to a convolution with $PSF_{\Delta z}$.
  • Figure 3: The alignment process. (a) The SLM is replaced by a camera to locate the µDisplay's image plane and adjust the magnification of the $4f$ system. (b) The magnification of the $4f$ system is set by displaying a circle of known width on the µDisplay and measuring the width of the resulting image on the camera. Manual alignments: (c) Lateral positioning: targets displayed on the SLM are manually aligned with squares of uniform intensity on the µDisplay. (d) Axial positioning: when the SLM is conjugated to the µDisplay, sharp images of the resolution targets are observed on both modulators. Automated alignments: (e) Disks of uniform intensity are displayed on the µDisplay. To avoid overlap when the camera is defocused and to minimize the flux, only one disk out of four is displayed at a time, so that each disk's image never overlaps with its neighbors. Examples of the successive positions of the disks for the four acquisitions are shown as yellow circles. (f) Disks of alternating indices displayed on the SLM appear opaque on the camera due to diffraction by their sharp edges and can be detected by a standard ellipse-fitting algorithm. Examples of the successive positions of the disks for the four acquisitions are shown as blue circles. Results: (g) Resulting calibrated surface: the zones of the µDisplay (in red) and of the SLM could not be perfectly aligned manually due to limited precision in translation and rotation. (h) They match almost perfectly after using our algorithm.
  • Figure 4: Comparison of the position of the speckle-like outputs before and after the automated alignment process. (a) An example of raw speckle patterns and the average image. After calibration, the average image is a speckle-like pattern (right), whereas before calibration the sum averages to a square corresponding to the illuminated zone (left). (b) The shift of each RoI relative to the central image, measured by cross-correlation. The yellow cross indicates where the correlation becomes irrelevant because the speckle patterns are too different. (c) A typical histogram of shifts: before calibration, the images are shifted by up to 5 pixels (in blue). After calibration, the images are shifted by less than 3 pixels, with the vast majority being shifted by 0 or 1 pixel (in orange).
  • Figure 5: Effect of the multiplexing on the noise. (a) Experimental acquisition of MNIST digits on the setup in a conjugated configuration. (b) MSE to the mean image for each RoI. (c) Evolution of the noise with the number of images taken into account for averaging, with standard deviation and fit.
  • ...and 7 more figures
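
The shift measurement mentioned in the caption of Figure 4(b) can be sketched as follows. This is a generic FFT-based circular cross-correlation that returns the integer pixel shift between a reference RoI and another RoI; it is an assumed, textbook estimator and not necessarily the implementation used by the authors. The function name `measure_shift` is hypothetical.

```python
import numpy as np

def measure_shift(reference, image):
    """Integer shift (dy, dx) such that image is close to np.roll(reference, (dy, dx)).

    Uses a circular cross-correlation computed with FFTs (an assumption:
    both RoIs have the same shape and wrap-around effects are negligible).
    """
    ref = reference - reference.mean()  # remove the mean so the peak is not
    img = image - image.mean()          # dominated by the DC component
    spectrum = np.fft.fft2(img) * np.conj(np.fft.fft2(ref))
    xcorr = np.fft.ifft2(spectrum).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Peaks beyond half the image size correspond to negative shifts.
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, xcorr.shape))

# Usage: a synthetic speckle-like pattern displaced by (2, -3) pixels.
rng = np.random.default_rng(0)
speckle = rng.random((64, 64))
displaced = np.roll(speckle, shift=(2, -3), axis=(0, 1))
shift = measure_shift(speckle, displaced)
```

Applying such an estimator to every RoI against the central one would give a shift map and histogram of the kind described in panels (b) and (c). When two speckle patterns are too dissimilar, the correlation peak is no longer meaningful, which is the situation the caption flags with the yellow cross.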