Hardware-In-The-Loop Training of a 4f Optical Correlator with Logarithmic Complexity Reduction for CNNs

Lorenzo Pes; Maryam Dehbashizadeh Chehreghan; Rick Luiken; Sander Stuijk; Ripalta Stabile; Federico Corradi

TL;DR

The paper investigates hardware-in-the-loop training of a $4f$ optical correlator for CNN-style classification on an MNIST subset, comparing traditional backpropagation (BP) with the forward-only PEPITA algorithm. It demonstrates that PEPITA can achieve accuracy close to that of BP while reducing the computational complexity of training from $O(n^2 \log n)$ to $O(n^2)$ and without requiring a differentiable model of the device. Using an optical setup with $8$ Fourier kernels and a simple CNN-like architecture, the study reports test accuracies of $88.8 \pm 4\%$ for BP and $87.6 \pm 3\%$ for PEPITA on 600 training samples and 100 test samples, with an SSIM of around $0.8$ between software and optical results. The work highlights practical bottlenecks (SLM-driven throughput) and points toward FPGA-based parallelism as a path to scaling to larger datasets.

Abstract

This work evaluates a forward-only learning algorithm on the MNIST dataset with hardware-in-the-loop training of a $4f$ optical correlator, achieving 87.6% accuracy with $O(n^2)$ complexity, compared to backpropagation, which achieves 88.8% accuracy with $O(n^2 \log n)$ complexity.
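
As a rough illustration of what the $4f$ correlator computes, the following is a minimal sketch of an idealized simulation, not the authors' device model. It assumes a noise-free system in which the first lens is modeled as a 2D FFT, the Fourier-plane mask as a pointwise multiplication, the second lens as an inverse FFT, and the camera as an intensity detector. The names `correlator_4f` and `circular_conv2d` are hypothetical, and effects such as SLM quantization and optical aberrations are ignored.

```python
import numpy as np

def correlator_4f(image, fourier_kernel):
    """Idealized 4f system (assumption: lossless, noise-free optics)."""
    field = np.fft.fft2(image)             # lens 1: Fourier transform of the input
    filtered = field * fourier_kernel      # Fourier-plane mask: pointwise product
    output_field = np.fft.ifft2(filtered)  # lens 2, modeled here as an inverse FFT
    return np.abs(output_field) ** 2       # camera records intensity, not amplitude

def circular_conv2d(image, kernel):
    """Direct circular convolution, used only as a slow reference check."""
    n, m = image.shape
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for u in range(n):
                for v in range(m):
                    out[i, j] += image[u, v] * kernel[(i - u) % n, (j - v) % m]
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernel = np.zeros((8, 8))
kernel[:3, :3] = rng.random((3, 3))        # small spatial kernel, zero-padded

optical = correlator_4f(image, np.fft.fft2(kernel))
reference = circular_conv2d(image, kernel) ** 2
print(np.allclose(optical, reference))
```

In this idealization the optical output equals the squared circular convolution of the input with the spatial kernel. In software the same operation costs $O(n^2 \log n)$ for an $n \times n$ image because of the FFTs, whereas the optical system performs the transforms by propagation through the lenses.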

Paper Structure (6 sections, 2 figures)

This paper contains 6 sections, 2 figures.

Introduction
Forward-only training of optical devices
Setup and experiment
Results and Discussion
Conclusion
Acknowledgment

Figures (2)

Figure 1: Hardware-in-the-loop (HWL) training with a 4f optical correlator. (a) Physical 4f correlator. (b) Full system view with input and output processing. (c) HWL training with BP. (d) HWL training with PEPITA. In both (c) and (d), the blue path represents the physical device. The computational complexity of the update rules, which are given by the orange equations, is shown in red.
Figure 2: HWL vs. software training results. (a) Mean training loss and training accuracy per epoch for each algorithm under HWL training with 300 MNIST samples. (b) Software versus experimental optical convolution with an edge detection kernel. The average SSIM is reported inside the panel. (c) Mean test accuracy on the optical device at the end of training. (d) Total FLOPS for BP and PEPITA. The approximate FLOPS for BP are computed without accounting for $dL/d z_{hw}$ from downstream layers, which would further increase the total.
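
To make the SSIM comparison in Figure 2(b) more concrete, the sketch below computes a single-window (global) SSIM between a software result and a perturbed copy standing in for an optical measurement. This is an assumption-laden illustration: the source does not say which SSIM implementation or window size was used, the function name `global_ssim` is hypothetical, and the "optical" image here is synthetic noise added to the software image, not measured data.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Global (non-windowed) SSIM with the standard constants K1=0.01, K2=0.03."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

rng = np.random.default_rng(0)
software = rng.random((28, 28))            # stand-in for a software convolution output
# Synthetic perturbation standing in for an optical measurement (assumption).
optical = np.clip(software + 0.1 * rng.standard_normal((28, 28)), 0.0, 1.0)

score_same = global_ssim(software, software)
score_noisy = global_ssim(software, optical)
print(score_same, score_noisy)
```

Identical images give an SSIM of 1, and any mismatch lowers the score. Windowed SSIM implementations average this quantity over local patches and generally give different values from the global variant shown here.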
