Engineering nonlinear activation functions for all-optical neural networks via quantum interference

Ruben Canora, Xinzhe Xu, Ziqi Niu, Hadiseh Alaeian, Shengwang Du

TL;DR

This work tackles the nonlinear activation bottleneck in all-optical neural networks by engineering quantum-interference–based nonlinearities in a three-level atomic medium. The authors develop both lifetime-broadened theory and Doppler-broadened experimental validation in rubidium vapor, achieving sigmoid- and ReLU-like activations at ultralow powers, down to about $17~\mu$W per neuron, with scalability under roughly $20$ W total. They introduce a two-field, two-channel activation scheme enabling true MIMO functionality and demonstrate all-optical gradients suitable for backpropagation. Together, the results point to a practical route for high-speed, energy-efficient optical AI hardware, with potential for on-chip integration and optical training to support networks with millions of neurons.

Abstract

All-optical neural networks (AONNs) promise transformative gains in speed and energy efficiency for artificial intelligence (AI) by leveraging the intrinsic parallelism and wave nature of light. However, their scalability has been fundamentally limited by the high power requirements of conventional nonlinear optical elements. Here, we present a low-power nonlinear activation scheme based on a three-level quantum system driven by dual laser fields. This platform introduces a two-channel nonlinear activation matrix with both self- and cross-nonlinearities, enabling true multi-input, multi-output optical processing. The system supports tunable activation behaviors, including sigmoid and ReLU functions, at ultralow power levels ($17~\mu$W per neuron). We validate our approach through theoretical modeling and experimental demonstration in rubidium vapor cells, showing the feasibility of scaling to deep AONNs with millions of neurons operating under $20$ W of total optical power. Crucially, we also demonstrate the all-optical generation of gradient-like signals with backpropagation, paving the way for all-optical training. These results mark a major advance toward scalable, high-speed, and energy-efficient optical AI hardware.
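
The scaling claim can be checked with simple arithmetic on the two figures quoted in the abstract. The sketch below is only a back-of-envelope estimate: it assumes that every neuron draws the full quoted per-neuron power and that nothing else (linear layers, optical losses, detection) consumes any of the budget. The function names are hypothetical and chosen for illustration.

```python
# Back-of-envelope power budget from the abstract's two numbers.
# Assumption: activation power is the only cost; losses are ignored.
POWER_PER_NEURON_W = 17e-6   # 17 uW per neuron (quoted in the abstract)
TOTAL_BUDGET_W = 20.0        # 20 W total optical power (quoted in the abstract)


def max_neurons(total_budget_w: float, power_per_neuron_w: float) -> int:
    """Largest neuron count whose summed activation power fits the budget."""
    return int(total_budget_w // power_per_neuron_w)


def total_power_w(n_neurons: int, power_per_neuron_w: float) -> float:
    """Summed activation power for a network of n_neurons."""
    return n_neurons * power_per_neuron_w


n_max = max_neurons(TOTAL_BUDGET_W, POWER_PER_NEURON_W)
print(f"neurons supported by the budget: {n_max:,}")
print(f"power for one million neurons: {total_power_w(1_000_000, POWER_PER_NEURON_W):.1f} W")
```

Under these assumptions the budget covers a little over one million neurons, and one million neurons need about 17 W, which is consistent with the "millions of neurons under 20 W" order of magnitude stated above.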

Paper Structure

This paper contains 6 sections, 15 equations, 11 figures.

Figures (11)

  • Figure 1: Schematic of the three-level nonlinear optical medium. (a) The energy level diagram of a three-level quantum system with two driving lasers and decay mechanisms, which are considered in this work. (b) Optical setup showing the alignment of two laser beams and their propagation in the medium. (c) Simplified circuit diagram for the two-channel (2-input $\times$ 2-output) nonlinear activation function unit.
  • Figure 2: EIT in the weak probe regime. $\Delta_2=0$. (a) EIT transmission spectra for varying ground-state decoherence rates, with $(\gamma_{12,a}, \gamma_{12,b}, \gamma_{12,c}) = (0.0001, 0.1, 0.5)\,\Gamma_3$, and a fixed control field $\Omega_2 = 5\,\Gamma_3$. (b) EIT nonlinear activation functions with different ground-state decoherence rates $\gamma_{12}$, expressed as the cross-nonlinearity between the probe output $|\Omega_{1,\mathrm{out}}|^2$ and the control input $|\Omega_{2}|^2$ [Eq. (\ref{eq:EIT1})]. (c) EIT transmission spectra for different control Rabi frequencies $\Omega_2 = (0, 5, 10)\,\Gamma_3$ at fixed $\gamma_{12} = 0.03\,\Gamma_3$. Green markers denote selected probe detunings $(\Delta_{1,a}, \Delta_{1,b}, \Delta_{1,c}) = (0.5, 1.5, 2.5)\,\Gamma_3$. (d) Corresponding nonlinear activation functions for the three detuning values shown in (c), illustrating tunable sigmoid-like response.
  • Figure 3: Power effect on the resonant probe transmission ($\Delta_1=\Delta_2=0$). Control-probe nonlinear functions with different probe inputs $(\Omega_{1,a}, \Omega_{1,b}, \Omega_{1,c}, \Omega_{1,d})=(10^{-5}, 1, 3, 10) \Gamma_3$ for (a) $\Gamma_{12}=0.1\Gamma_3$ and (b) $\Gamma_{12}=\Gamma_3$.
  • Figure 4: Nonlinear optical activation functions with comparable driving fields. $\Delta_2 = 0$. (a) Control-probe cross nonlinear activation functions between $|\Omega_{1,\mathrm{out}}|^2$ and $|\Omega_{2,\mathrm{in}}|^2$ with different probe detunings: $\Delta_1=(1/3,~2/3,~1)\,\Gamma_3$. Other parameters: $\Omega_{1,\mathrm{in}}=\Gamma_3$, and $\Gamma_{12}=0$. (b) Control-probe cross nonlinear activation functions between $|\Omega_{1,\mathrm{out}}|^2$ and $|\Omega_{2,\mathrm{in}}|^2$ with different ground-state population transfer rates: $\Gamma_{12} = (0.01,~0.1,~1)\,\Gamma_3$. Other parameters: $\Omega_1 = \Gamma_3$, and $\Delta_1 = \Gamma_3/3$. (c) Probe self nonlinear activation functions with different control powers: $(\Omega_{2,a},\Omega_{2,b},\Omega_{2,c})=(1,~3,~10)\,\Gamma_3$. Other parameters: $\Delta_1 = \Omega_2/2$, and $\Gamma_{12}=0$. (d) Probe self nonlinear activation functions with different ground-state population transfer rates: $(\Gamma_{12,a},\Gamma_{12,b},\Gamma_{12,c}) = (0.01,~0.1,~1)\,\Gamma_3$. Other parameters: $\Omega_2=\Gamma_3$, and $\Delta_1=\Omega_2/2$.
  • Figure 5: Two-channel (2-input $\times$ 2-output) nonlinear activation functions with self- and cross-nonlinearity in a lifetime-broadened atomic medium. (a) Self- and (b) cross-nonlinearity of input 1, respectively. (c) Cross- and (d) self-nonlinearity of input 2, respectively. We set $\Omega_{1,\mathrm{in}}=\Gamma_3$ in (b) and $\Omega_{2,\mathrm{in}}=\Gamma_3$ in (c). $\Delta_1 = - \Delta_2 = 1/3\Gamma_3$.
  • ...and 6 more figures
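
The control-to-probe cross-nonlinearity described in the Figure 2 caption can be illustrated with the textbook weak-probe EIT response of a three-level system. The sketch below is not the paper's Eq. (eq:EIT1), which is not reproduced in this summary. It assumes the common form in which the probe absorption is proportional to $\mathrm{Re}\left[\gamma_{13} / \left(\gamma_{13} - i\Delta_1 + \frac{|\Omega_2|^2/4}{\gamma_{12} - i(\Delta_1-\Delta_2)}\right)\right]$ with $\gamma_{13}=\Gamma_3/2$. The optical depth `od`, the probe input level `probe_in`, and all function names are hypothetical, and the Rabi-frequency convention may differ from the authors' by a constant factor.

```python
import math

GAMMA3 = 1.0  # excited-state decay rate; all rates are in units of Gamma_3


def eit_transmission(omega2, delta1=0.0, delta2=0.0, gamma12=0.03, od=5.0):
    """Weak-probe transmission through a three-level medium (textbook EIT form).

    omega2  : control Rabi frequency
    gamma12 : ground-state decoherence rate (must be > 0 on two-photon resonance)
    od      : resonant optical depth (assumed value, not taken from the paper)
    """
    gamma13 = GAMMA3 / 2.0                        # optical coherence decay rate
    two_photon = gamma12 - 1j * (delta1 - delta2)
    denom = gamma13 - 1j * delta1 + (omega2 ** 2 / 4.0) / two_photon
    absorption = (gamma13 / denom).real           # equals 1 on resonance, control off
    return math.exp(-od * absorption)


def activation(control_intensity, probe_in=1e-3, **kwargs):
    """Probe output power as a function of the control input |Omega_2|^2."""
    omega2 = math.sqrt(control_intensity)
    return probe_in * eit_transmission(omega2, **kwargs)


# Sweep the control intensity for the three probe detunings marked in Fig. 2(c).
for delta1 in (0.5, 1.5, 2.5):
    curve = [activation(x, delta1=delta1) for x in (0.0, 1.0, 4.0, 25.0, 100.0)]
    print(delta1, [f"{y:.2e}" for y in curve])
```

On resonance the transmission in this model rises monotonically from $e^{-\mathrm{OD}}$ toward 1 as the control intensity grows, which is the saturating behavior the caption describes. With nonzero probe detuning the curve changes shape, which is the kind of tuning knob the caption attributes to $\Delta_1$.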