Towards the Information-Theoretic Limit of Programmable Photonics

Ryan Hamerly; Jasvith Raj Basani; Alexander Sludds; Sri Krishna Vadlamani; Dirk Englund

TL;DR

This work establishes an information-theoretic limit on the average phase shift required for universal programmable photonic circuits, showing a fundamental $O(1/\sqrt{N})$ scaling with circuit size. It then demonstrates that a 3-MZI mesh can approach this limit within a factor of about 2, achieving a practical $\sim$10× reduction in average phase shift compared to traditional MZI meshes, and proves that non-unitary (Gaussian) targets can saturate the bound using crossbar architectures. The authors also show that optical neural networks can be trained with all phase shifters constrained to $\lesssim 0.2$ radians without accuracy loss, highlighting the method’s potential for scalable photonic computing. Collectively, the results provide near-optimal, phase-efficient designs for large-scale photonic circuits and new routes for phase-constrained photonic learning and processing.

Abstract

The scalability of many programmable photonic circuits is limited by the $2\pi$ tuning range needed for the constituent phase shifters. To address this problem, we introduce the concept of a phase-efficient circuit architecture, where the average phase shift is $\ll 2\pi$. We derive a universal information-theoretic limit to the phase-shift efficiency of universal multiport interferometers, and propose a "3-MZI" architecture that approaches this limit to within a factor of 2. This is approximately a $10\times$ reduction in average phase shift over the prior art, and the average phase shift falls with system size as $O(1/\sqrt{N})$. For non-unitary circuits, we show that the 3-MZI saturates the theoretical bound for Gaussian-distributed target matrices. Using this architecture, we show optical neural network training with all phase shifters constrained to $\lesssim 0.2$ radians without loss of accuracy.
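As a rough intuition for the $O(1/\sqrt{N})$ scaling (a minimal sketch, not the authors' derivation), the entries of a Haar-uniform $N \times N$ unitary have typical magnitude proportional to $1/\sqrt{N}$, so the couplings a large circuit must realize become small. The helper names `haar_unitary` and `scaled`, the QR-based sampler, and the chosen sizes are assumptions made for illustration.

```python
import numpy as np

def haar_unitary(n, rng):
    """Sample an n x n Haar-uniform unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    # QR is only unique up to a phase per column; normalizing by the phases
    # of diag(r) removes that ambiguity so the samples are Haar-distributed.
    d = np.diagonal(r)
    return q * (d / np.abs(d))

rng = np.random.default_rng(0)
sizes = [4, 16, 64, 256]  # illustrative sizes, not taken from the paper

# Mean entry magnitude rescaled by sqrt(N); it stays of order one as N grows,
# which means the unscaled magnitude shrinks like 1/sqrt(N).
scaled = {}
for n in sizes:
    u = haar_unitary(n, rng)
    scaled[n] = np.mean(np.abs(u)) * np.sqrt(n)
    print(f"N = {n:4d}   sqrt(N) * mean|U_ij| = {scaled[n]:.3f}")
```

The rescaled value should settle near $\sqrt{\pi}/2 \approx 0.89$ for large $N$, the mean magnitude of a unit-variance complex Gaussian. How this entry-level scaling translates into phase shifts depends on the circuit architecture, which this sketch does not model.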

Paper Structure (14 sections, 39 equations, 8 figures, 4 tables)

Information-Theoretic Phase Bound
Definitions
Deriving the Bound
Reaching the Bound with MZI Meshes
General Theory
MZI Mesh
3-MZI Mesh
Agreement with Numerics
Saturating the Bound: Gaussian Matrices with Crossbar Meshes
Diamond and PILOSS Crossbar Meshes
Entropy Bound for Random Matrices
$L_{\infty}$-constrained DNN Training
Conclusion
Moments $\langle \theta\rangle_1$, $\langle \theta\rangle_2$, $\Delta\theta_{\rm IQR}$ for the MZI mesh

Figures (8)

Figure 1: (a) Schematic of a generic multiport interferometer, with four concrete implementations: Reck triangle, Clements rectangle, MPLC, and programmable MMI (clockwise from bottom left). (b) Representation of a programmable circuit consisting of discrete phase shifters (colored) and fixed coupler unitaries (gray), with (c) the corresponding design of a Clements mesh, and (d) the mathematical description of the unitary as a product of phase-shift $D(\psi_i)$ and coupler $U_i$ matrices. (e) Related push-pull construction for mesh-like multiport interferometers.
Figure 2: Visualization of the map $f: \vec{\psi} \rightarrow U$ implemented by a multiport interferometer, where the unit volume ${\rm d} V_\psi$ is mapped to a parallelepiped spanned by the gradient vectors $\nabla_{\psi_m} f$ and has a volume ${\rm d} V_U = |\partial(U)/\partial(\psi)| {\rm d} V_\psi$.
Figure 3: (a) $8\times 8$ Clements mesh, which consists of a rectangular array of tunable crossings (the phase screen is omitted for clarity). (b) MZI crossing and its representation by a $2\times 2$ matrix $T$ (Eq. (\ref{eq:tmat})) with splitting ratio $s$. (c) $P(s)$ for MZIs of rank 1, 3, 5, and 7 for Haar-uniform unitaries, showing how the distribution becomes concentrated as one moves to the center of the mesh.
Figure 4: (a) Standard MZI crossing, which maps $(\theta, \phi) \rightarrow s$ via polar coordinates. The probability $P(s)$ for an $N = 16$ mesh is plotted to show the concentration near $s = 0$. (b) Probability $P(\theta, \phi)$ as a function of mesh size, showing confinement in $\theta$ but not $\phi$. (c) 3-MZI crossing, whose mapping is locally Cartesian near $s = 0$ and which confines both $\theta$ and $\phi$. (d) Use of fabrication-induced phase offsets to shift the distribution $P(\theta, \phi)$ to center around zero.
Figure 5: (a) Visualization of meshes of size $N = 4$--1024, programmed to realize specific Haar-uniform sampled matrices, where MZIs are colored according to their average phase shift $|\psi| \equiv \tfrac{1}{2}(|\theta| + |\phi|)$. (b) Plot of the phase-shift moments as a function of mesh size (left) and the MZI/3-MZI ratio for each moment type, illustrating the advantage of the latter mesh type (right). (c) Entropy per degree of freedom of the MZI and 3-MZI meshes, as compared to the information-theoretic bound. Both generic and push-pull (PP) bounds are plotted. (d) Phase-shift moments as compared to the bound.
...and 3 more figures
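The contrast drawn in the Figure 4 caption, between a polar-like map that confines only one coordinate and a locally Cartesian map that confines both, can be illustrated with a toy calculation. This is a hedged sketch: the complex-Gaussian model for $s$, the width `sigma`, and the function `mean_abs_coordinates` are assumptions for illustration, and the actual MZI and 3-MZI parameterizations from the paper are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_abs_coordinates(sigma, n_samples=100_000):
    """Mean |coordinate| for a complex value s concentrated near s = 0.

    Toy assumption: s is complex Gaussian with per-quadrature width sigma.
    Returns ((mean radius, mean |angle|), (mean |Re s|, mean |Im s|)).
    """
    s = sigma * (rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples))
    # Polar description: the radius shrinks with sigma, the angle does not.
    polar = (np.mean(np.abs(s)), np.mean(np.abs(np.angle(s))))
    # Cartesian description: both coordinates shrink with sigma.
    cartesian = (np.mean(np.abs(s.real)), np.mean(np.abs(s.imag)))
    return polar, cartesian

for sigma in (0.5, 0.05):
    polar, cartesian = mean_abs_coordinates(sigma)
    print(f"sigma = {sigma}: polar (r, |angle|) = {polar}, cartesian (|x|, |y|) = {cartesian}")
```

In this toy model the mean absolute angle stays near $\pi/2$ however tightly $s$ is concentrated, while both Cartesian coordinates shrink in proportion to `sigma`. That is the qualitative behavior the caption attributes to $\phi$ in the standard MZI versus $\theta$ and $\phi$ in the 3-MZI.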
