Light-Bound Transformers: Hardware-Anchored Robustness for Silicon-Photonic Computer Vision Systems

Xuming Chen, Deniz Najafi, Chengwei Zhou, Pietro Mercati, Arman Roohi, Mohsen Imani, Mahdi Nikdast, Shaahin Angizi, Gourav Datta

Abstract

Deploying Vision Transformers (ViTs) on near-sensor analog accelerators demands training pipelines that are explicitly aligned with device-level noise and energy constraints. We introduce a compact framework for silicon-photonic execution of ViTs that integrates measured hardware noise, robust attention training, and an energy-aware processing flow. We first characterize bank-level noise in microring-resonator (MR) arrays, including fabrication variation, thermal drift, and amplitude noise, and convert these measurements into closed-form, activation-dependent variance proxies for attention logits and feed-forward activations. Using these proxies, we develop Chance-Constrained Training (CCT), which enforces variance-normalized logit margins to bound attention rank flips, and a noise-aware LayerNorm that stabilizes feature statistics without changing the optical schedule. These components yield a practical ``measure $\rightarrow$ model $\rightarrow$ train $\rightarrow$ run'' pipeline that optimizes accuracy under noise while respecting system energy limits. Hardware-in-the-loop experiments with MR photonic banks show that our approach restores near-clean accuracy under realistic noise budgets, with no in-situ learning or additional optical MACs.
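The abstract does not state the CCT objective in closed form, so the following is only an illustrative sketch of what a variance-normalized logit margin could look like. It assumes independent Gaussian noise on each attention logit with a known per-logit variance proxy. The function names and the target margin `kappa` are hypothetical and are not taken from the paper.

```python
import math


def normalized_margins(logits, variances):
    """Variance-normalized margins between the top logit and every other logit.

    Assumes strictly positive variances (hypothetical proxy values).
    """
    top = max(range(len(logits)), key=lambda i: logits[i])
    margins = []
    for j, (z, v) in enumerate(zip(logits, variances)):
        if j == top:
            continue
        # Assumption: noise on the two logits is independent, so variances add.
        std = math.sqrt(variances[top] + v)
        margins.append((logits[top] - z) / std)
    return margins


def flip_probability(margin):
    """Probability that a Gaussian-perturbed logit pair swaps order.

    Equals Phi(-margin), the standard normal CDF evaluated at -margin.
    """
    return 0.5 * (1.0 + math.erf(-margin / math.sqrt(2.0)))


def margin_penalty(logits, variances, kappa=2.0):
    """Hinge penalty that vanishes once every normalized margin reaches kappa."""
    return sum(max(0.0, kappa - m) for m in normalized_margins(logits, variances))
```

Under these assumptions, driving the penalty to zero bounds each pairwise flip probability by `flip_probability(kappa)`, which is the sense in which a margin constraint can act as a chance constraint on attention rank flips.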

Paper Structure

This paper contains 8 sections, 8 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: (a) Fabricated SiPh MR array with >200 identical MR cells (SEM shown). (b) Input and through-port spectra after parameter imprinting via tuning the MR resonance. (c) Multiple MRs in one arm imprint weight values onto the input signal at different wavelengths.
  • Figure 2: Overview of the proposed noisy under-test architecture.
  • Figure 3: (a) Matrix splitting and mapping methodology, (b) Optical matrix-matrix multiplication.
  • Figure 4: (a) Resonance wavelength shift distribution of a bank of 4 randomly selected MRs placed at 100 random locations on the variation map, where each data point represents the shift observed in an individual MR at a specific location. (b) Variation heat map showing spatial process-induced deviations across the chip, used for MR placement analysis.
  • Figure 5: End-to-end noise-aware ViT for photonic hardware. Measured microring-bank statistics inform closed-form variance proxies, driving two algorithmic defenses: chance-constrained attention (CCT) to preserve ranking under noise, and noise-aware LayerNorm for FFN channel stability. Training jointly minimizes task and consistency losses; at deployment, the same weights yield robustness to fabrication, thermal, and laser noise on the photonic core.
  • ...and 3 more figures