Embedding Matrices in Programmable Photonic Networks with Flexible Depth and Width

Matthew Markowitz, Kevin Zelaya, Mohammad-Ali Miri

TL;DR

This work derives a general relation for the width and depth of a network that guarantees representing all N × N complex-valued matrix operations and promises a more adaptable and scalable route to photonic matrix processors.

Abstract

We show that programmable photonic circuit architectures composed of alternating mixing layers and active layers offer a high degree of flexibility. This alternating configuration enables the systematic tailoring of both the network's depth (number of layers) and width (size of each layer) without compromising computational capabilities. From a mathematical perspective, our approach can be viewed as embedding an arbitrary target matrix into a higher-dimensional matrix, which can then be represented with fewer layers and larger active elements. We derive a general relation for the width and depth of a network that guarantees representing all $N \times N$ complex matrix operations. Remarkably, we show that just two such active layers, interleaved with passive mixing layers, are sufficient to universally implement arbitrary matrix transformations. This result promises a more adaptable and scalable route to photonic matrix processors.
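The alternating structure described above can be illustrated with a short NumPy sketch. This is only a toy model under stated assumptions, not the authors' implementation: it assumes the layers are ordered as $F D_2 F D_1 F$, that $F$ is the unitary DFT matrix, that the active layers are lossless phase shifters, and that the $N \times N$ matrix is read from the upper-left block of the $K \times K$ network matrix. The function names and the values $N=4$, $K=9$, $M=2$ are chosen for illustration only.

```python
import numpy as np

def dft_matrix(K):
    """Unitary K x K discrete Fourier transform matrix (one possible mixing layer F)."""
    idx = np.arange(K)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / K) / np.sqrt(K)

def network_matrix(F, diagonals):
    """Assumed ordering F D_M F ... F D_1 F, where D_m = diag(diagonals[m])."""
    U = F.copy()
    for d in diagonals:
        # Scale the rows by d (same as diag(d) @ U), then apply the next mixing layer.
        U = F @ (d[:, None] * U)
    return U

rng = np.random.default_rng(0)
N, K, M = 4, 9, 2                      # illustrative sizes (assumption)
F = dft_matrix(K)
phases = rng.uniform(0.0, 2.0 * np.pi, size=(M, K))
diagonals = np.exp(1j * phases)        # lossless active layers: |d| = 1
U = network_matrix(F, diagonals)       # full K x K network matrix
A_embedded = U[:N, :N]                 # assumed embedding: upper-left N x N block
```

With phase-only layers the full $K \times K$ matrix stays unitary, so the embedded block has spectral norm at most one. This is presumably why a scaling factor is needed before a general target can be matched.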

Paper Structure

This paper contains 3 equations, 5 figures.

Figures (5)

  • Figure 1: Operational graph of the conventional SVD architecture (a) and proposed flexible two-layer architecture (b). The mixing layers $F$ alternate with diagonal phase layers with elements $d_{p}^{(m)}$, with $p\in\{1,\ldots,K\}$ and $m\in\{1,2\}$. $F$ can be realized using MMI couplers or coupled waveguide arrays (c). The active diagonal layer can comprise lossy elements, such as MZIs or PCMs, or lossless phase shifters (d). The intensity of the desired $N \times N$ target matrix (e) is encoded into the two-layer $K \times K$ system (f).
  • Figure 2: Error norm $\log_{10}(L)$ for 100 random targets and embedded matrices of size $N=3,4,5$ and various choices of $K$. The number of layers is fixed to $M=2$ (a) and $M=3$ (b). In each case, the upper and lower panels show the error norms for the non-unitary ($\vert d^{(m)}_{p}\vert\neq 1$) and unitary ($\vert d^{(m)}_{p}\vert= 1$) designs, respectively. Complex-valued $\mathbb{C}^{N\times N}$ targets are considered during the optimization, combined with the DFT and DFrFT as the mixing layers $F$. The number next to each error bar denotes the corresponding power-loss factor $\alpha$.
  • Figure 3: Dependence of the loss function $\log_{10}(L)$ on the scaling parameter $\alpha$ for targets in $\mathbb{C}^{4\times 4}$. The unitary architecture is considered, with the DFT as the passive $F$ layer. Only values of $K$ above the critical value $K_{c}$ are considered for $M=2$ and $M=3$ layers. For the analysis, 50 targets are chosen per $\alpha\in[1,3]$, from which the mean (curve) and standard deviation (shaded area) are displayed.
  • Figure 4: Architecture comparison and scalability. Proposed architecture (lossy layer) with $K=9$ ports and $M=2$ layers (a), and $K=5$ with $M=3$ layers (b), used to reshape a general-purpose $4\times 4$ matrix. Relevant quantities characterizing the waveguide array (c) and MZI (d). (e) Complex-valued matrix implementation reported in markowitz2024learning. (f) Complex-valued matrix implementation via SVD using two unitaries. Figs. 4(a)-(b),(e)-(f) are scaled with the same aspect ratio and relevant design dimensions.
  • Figure 5: Scalability of the different architectures in terms of the total horizontal $L_{x}$ and vertical $L_{y}$ lengths, shown in the upper-left and lower-left panels, respectively. The total area is plotted in the right panel.
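
The captions of Figures 2 and 3 refer to an error norm $L$ and a scaling parameter $\alpha$ without stating their definitions. The sketch below shows one plausible form, as an assumption only: a squared Frobenius distance between the target and the $\alpha$-scaled upper-left block of the $K \times K$ network matrix, normalized by the target norm. The function `error_norm`, the placement of $\alpha$, and the random stand-in for the network matrix are all hypothetical.

```python
import numpy as np

def error_norm(target, U, alpha=1.0):
    """Assumed loss: ||target - alpha * U[:N, :N]||_F^2 / ||target||_F^2."""
    N = target.shape[0]
    block = alpha * U[:N, :N]          # assumed embedding and scaling convention
    return np.linalg.norm(target - block) ** 2 / np.linalg.norm(target) ** 2

rng = np.random.default_rng(0)
N, K = 3, 7                            # illustrative sizes (assumption)

# Stand-in for a K x K network matrix; a real design would build it from F and the diagonals.
U = rng.normal(size=(K, K)) + 1j * rng.normal(size=(K, K))
target = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

loss = error_norm(target, U, alpha=2.0)
log_loss = np.log10(loss)              # the figures report the loss on a log10 scale

# Sanity check: a target equal to the scaled block gives zero loss.
exact = error_norm(2.0 * U[:N, :N], U, alpha=2.0)
```

In an actual design loop this quantity would be minimized over the diagonal elements $d_{p}^{(m)}$. The optimizer used for the reported results is not specified in this summary.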