The Spatial Complexity of Optical Computing and How to Reduce It

Yandong Li, Francesco Monticone

TL;DR

The paper examines how the physical size of wave-based optical hardware must scale with the dimension of the computing task, and proposes a space-efficient design paradigm built on physics-informed sparsity. Exploiting the concept of overlapping nonlocality (ONL), it demonstrates two complementary approaches: local sparse optical neural networks (ONNs) for free-space optics and block-diagonal ONNs for photonic chips. These reduce device thickness, or the number of required Mach–Zehnder interferometers (MZIs), to 1–10% of conventional designs with modest accuracy loss. The work uses BIMT-based training to enforce locality in free-space ONNs and a two-phase pruning strategy to obtain compact block-diagonal architectures on chips, with strong results on standard benchmarks and on real-world models such as MobileNetV2. The findings offer a practical route to space- and energy-efficient optical accelerators and to hybrid photonic-electronic edge solutions. The core scaling relations include ONL growth of $\mathcal{O}(N^{1/2})$ for local sparse kernels, together with the corresponding thickness bounds, and they frame the ultimate limits of optical computing as a trade-off between device size and accuracy.

Abstract

Similar to algorithms, which consume time and memory to run, hardware requires resources to function. For devices processing physical waves, implementing operations needs sufficient "space," as dictated by wave physics. How much space is needed to perform a certain function is a fundamental question in optics, with recent research addressing it for given mathematical operations, but not for more general computing tasks, e.g., classification. Inspired by computational complexity theory, we study the "spatial complexity" of optical computing systems in terms of scaling laws - specifically, how their physical dimensions must scale as the dimension of the mathematical operation increases - and propose a new paradigm for designing optical computing systems: space-efficient neuromorphic optics, based on structural sparsity constraints and neural pruning methods motivated by wave physics (notably, the concept of "overlapping nonlocality"). On two mainstream platforms, free-space optics and on-chip integrated photonics, our methods demonstrate substantial size reductions (to 1%-10% the size of conventional designs) with minimal compromise on performance. Our theoretical and computational results reveal a trend of diminishing returns on accuracy as structure dimensions increase, providing a new perspective for interpreting and approaching the ultimate limits of optical computing - a balanced trade-off between device size and accuracy.

Paper Structure

This paper contains 16 sections, 6 equations, 5 figures.

Figures (5)

  • Figure 1: Reducing the spatial complexity of optical computing. Improving the scaling laws of free-space optics (a-c) and photonic chips (d-f) following the proposed space-efficient design paradigm. (a-c) Illustration of simplifying a generic free-space optical system. The thickness scaling law can be reduced to $\mathcal{O}(N^{1/2})$ when the $N \times N$ kernel matrix, representing the input-output function of the optical system, is designed to exhibit a "local sparse" form. Coloured lines represent the coupling coefficients between input and output ports (sampling points of the corresponding field profiles). Visually, a sparse matrix has significantly fewer couplings than a dense one, and in a local kernel matrix all couplings are only slightly inclined from vertical, a feature that nonlocal kernels lack. (d-f) Illustration of simplifying a generic two-dimensional photonic chip composed of a mesh of Mach–Zehnder interferometers (MZIs). Block-diagonalization, understood from a graph perspective, splits the $N \times N$ kernel matrix into $N/N'$ small, decoupled complete bipartite graphs $K_{N',N'}$, each of which requires $N'(N'-1)/2$ MZIs to realize. Therefore, when each block is sufficiently small, block-diagonalization reduces the total number of required MZIs from quadratic to quasi-linear in $N$.
  • Figure 2: Standard optical nonlocality and overlapping nonlocality. (a) Schematic of a 1D free-space optical system. The communication cone of an output port, defined as the set of all couplings to it, is characterized by two parameters: the horizontal shift $d_{\text{shift}}$ and the spanning range $w_{\text{cone}}$. An ideally local optical device would have both $d_{\text{shift}}$ and $w_{\text{cone}}$ equal to zero. (b) The overlapping nonlocality (ONL) $C$ associated with a transverse aperture (or "cut") is the number of communication cones intersecting it. For example, $C=5$ for the green cut and $C=6$ for the purple cut. (c) Demonstration of calculating $C$ for an arbitrary cut in a 2D free-space optical system. For a given cut, an output adds one to $C$ only if its communication cone intersects the cut. (d) Definition of the in-plane distance $d_{\parallel}$ traversed by an optical coupling. For a pair of coupled ports $(i, j)$, $d_{\parallel}$ is the distance between them, projected onto the output (or input) plane. This distance measures the nonlocality of each individual coupling.
  • Figure 3: Scaling laws of three types of optical device kernels. Scaling laws of the maximum ONL, $\max(C)$, with respect to the mathematical operation dimension, $N$, for the three considered types of device kernels: (a) trivial sparse, (b) row sparse, and (c) local sparse kernels. In (a), lines and shaded regions (inset) represent the average and one standard deviation of numerical simulation results. The standard deviation is negligibly small. In (b,c), darker lines represent theoretical values derived from Eq. \ref{eq:scaling_laws}, and lighter lines represent numerical simulation results (see Methods and Supplementary Note 1).
  • Figure 4: Space-efficient computing with free-space optics. Schematic of (a) conventional, (b) row sparse, and (c) local sparse ONNs performing a classification task on the fashion-MNIST dataset. (d-f) Thicknesses of the three interlayer regions in conventional (top row), row sparse (middle row), and local sparse (bottom row) ONNs. To demonstrate the trade-off between model thickness and accuracy, each model type is pruned using five different pruning thresholds (see Methods). Error bars represent one standard deviation of thickness (along $y$-axis) and accuracy (along $x$-axis) across eight different random seeds. A local sparse ONN with a pruning threshold of $\tau = 0.01$, compared to a conventional ONN with $\tau = 0.05$, drastically reduces the thickness while only compromising accuracy by $3.6\%$ on the fashion-MNIST dataset (vertical orange arrows; see Supplementary Note 5).
  • Figure 5: Space-efficient computing on photonic chips. (a) Trade-off between model accuracy and the degree of block-diagonalization for different block-diagonal models. Shaded regions represent one standard deviation of accuracy across eight different random seeds. (b-e) Weights of block-diagonal and unpruned models. For each model, the number of free parameters, the involved unitary transformations (including both left and right unitary matrices, $U$ and $V^T$, from the singular value decomposition of all weight matrices), and the total number of required MZIs are listed on the right.
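
The MZI counts quoted in the Figure 1 caption can be checked with a short sketch. It assumes a single $N \times N$ unitary realized with $N(N-1)/2$ MZIs and a block size $N'$ that divides $N$. The function names are illustrative and are not taken from the paper.

```python
def mzi_count_full(n: int) -> int:
    """MZIs for one dense n x n mesh, using the n(n-1)/2 count from the caption."""
    return n * (n - 1) // 2


def mzi_count_block_diagonal(n: int, block: int) -> int:
    """MZIs for n/block decoupled blocks, each a dense block x block mesh.

    Assumption: `block` divides `n` exactly; uneven blocks are not handled.
    """
    if block <= 0 or n % block != 0:
        raise ValueError("block must be a positive divisor of n")
    return (n // block) * mzi_count_full(block)


if __name__ == "__main__":
    n, block = 64, 8
    full = mzi_count_full(n)                       # 2016
    blocked = mzi_count_block_diagonal(n, block)   # 8 blocks * 28 = 224
    print(full, blocked, blocked / full)
```

For a fixed block size the total is $N(N'-1)/2$, which grows linearly in $N$. The sketch counts one unitary only, whereas Figure 5 counts the $U$ and $V^T$ factors of every weight matrix.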
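
The ONL count described in the Figure 2 caption can be illustrated for the 1D case. The sketch below is a simplified reading, not the paper's implementation. It assumes that ports sit at integer positions, that the communication cone of output $i$ covers the interval spanned by its own position and the positions of its coupled inputs, and that cuts lie midway between adjacent ports. The function name and the `tol` threshold are hypothetical.

```python
import numpy as np


def overlapping_nonlocality_1d(kernel, tol=0.0):
    """Return C for every cut of a 1D system described by `kernel`.

    kernel[i, j] is the coupling from input port j to output port i.
    Cut p sits at transverse position p + 0.5, between ports p and p + 1.
    An output adds one to C at a cut when its cone straddles that cut.
    """
    kernel = np.asarray(kernel)
    n_out, n_in = kernel.shape
    counts = np.zeros(max(n_out, n_in) - 1, dtype=int)
    for i in range(n_out):
        coupled = np.flatnonzero(np.abs(kernel[i]) > tol)
        if coupled.size == 0:
            continue  # an uncoupled output has no cone
        lo = min(i, int(coupled.min()))
        hi = max(i, int(coupled.max()))
        counts[lo:hi] += 1  # cuts p with lo <= p < hi are crossed
    return counts


def banded_kernel(n, half_width):
    """Toy local kernel: output i couples to inputs within half_width of i."""
    idx = np.arange(n)
    return (np.abs(idx[:, None] - idx[None, :]) <= half_width).astype(float)


if __name__ == "__main__":
    n = 16
    print(overlapping_nonlocality_1d(np.ones((n, n))).max())         # dense kernel
    print(overlapping_nonlocality_1d(np.eye(n)).max())               # ideally local
    print(overlapping_nonlocality_1d(banded_kernel(n, 2)).max())     # banded kernel
```

Under these assumptions a dense kernel gives $\max(C) = N$, an identity kernel gives zero, and a banded kernel gives a value set by the bandwidth rather than by $N$. The $\mathcal{O}(N^{1/2})$ law for local sparse kernels concerns 2D port layouts, which this 1D sketch does not cover.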