HEANA: A Hybrid Time-Amplitude Analog Optical Accelerator with Flexible Dataflows for Energy-Efficient CNN Inference
Sairam Sri Vatsavai, Venkata Sai Praneeth Karempudi, Ishan Thakkar
TL;DR
HEANA is a hybrid time-amplitude analog optical accelerator for CNN inference that addresses three limitations of prior MRR-based designs: crosstalk, dataflow rigidity, and limited in-situ accumulation. By combining a spectrally hitless arrangement of time-amplitude analog optical multipliers (TAOMs) with balanced photo-charge accumulators (BPCAs), HEANA supports flexible output-, input-, and weight-stationary dataflows and in-situ spatio-temporal accumulation, reducing the need for partial-sum (psum) buffers and external reduction networks. System-level evaluations across four CNNs show gains of at least 66x in FPS and 84x in FPS/W (gmean, equal-area comparison) over prior incoherent DPUs, with minimal Top-1/Top-5 accuracy loss at 8-bit quantization. These results highlight HEANA’s potential to deliver scalable, energy-efficient photonic CNN acceleration with broad dataflow support.
Abstract
Several photonic microring resonator (MRR)-based analog accelerators have been proposed to accelerate the inference of integer-quantized CNNs with remarkably higher throughput and energy efficiency than their electronic counterparts. However, existing analog photonic accelerators suffer from three shortcomings: (i) wavelength parallelism that is severely hampered by various crosstalk effects, (ii) inflexibility in supporting dataflows other than the weight-stationary dataflow, and (iii) failure to fully leverage the ability of photodetectors to perform in-situ accumulations. These shortcomings collectively limit the performance and energy efficiency of prior accelerators. To tackle them, we present a novel Hybrid timE Amplitude aNalog optical Accelerator, called HEANA. HEANA employs hybrid time-amplitude analog optical multipliers (TAOMs) that give it the flexibility to support multiple dataflows. A spectrally hitless arrangement of TAOMs significantly reduces crosstalk effects, thereby increasing the wavelength parallelism in HEANA. Moreover, HEANA employs our proposed balanced photo-charge accumulators (BPCAs), which enable buffer-less, in-situ, temporal accumulations, eliminating the need for reduction networks in HEANA and the associated latency and energy overheads. Our evaluation for the inference of four modern CNNs indicates that HEANA provides improvements of at least 66x and 84x in frames-per-second (FPS) and FPS/W (energy efficiency), respectively, for equal-area comparisons, on geometric mean (gmean) over two MRR-based analog CNN accelerators from prior work.
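To make the idea of buffer-less temporal accumulation concrete, the following is a minimal behavioral sketch in Python, not the circuit or the evaluation model from the paper. It assumes that each multiplication yields a charge proportional to the product of a pulse amplitude (here the weight magnitude) and a pulse duration (here the non-negative input), and that the sign of the weight selects which arm of a balanced accumulator receives that charge. The class name `BalancedAccumulator` and its methods are hypothetical names introduced only for this illustration.

```python
class BalancedAccumulator:
    """Toy model of a balanced accumulator (hypothetical, for illustration).

    Charge from positive-weight products builds up on one arm and charge
    from negative-weight products on the other. The signed result is the
    difference of the two arms at readout.
    """

    def __init__(self):
        self.pos = 0
        self.neg = 0

    def integrate(self, inputs, weights):
        """Accumulate one cycle of products without reading out.

        inputs:  non-negative integers (assumed to set pulse duration).
        weights: signed integers (assumed: magnitude sets pulse amplitude,
                 sign selects the accumulator arm).
        """
        for x, w in zip(inputs, weights):
            charge = abs(w) * x  # amplitude times duration
            if w >= 0:
                self.pos += charge
            else:
                self.neg += charge

    def read_and_reset(self):
        """Return the signed accumulated value and clear both arms."""
        out = self.pos - self.neg
        self.pos = 0
        self.neg = 0
        return out


# A dot product split across two cycles accumulates in place,
# so no partial sum is stored or reduced between the cycles.
acc = BalancedAccumulator()
acc.integrate([3, 5], [4, -1])  # first cycle
acc.integrate([2], [7])         # second cycle
result = acc.read_and_reset()   # 3*4 - 5*1 + 2*7 = 21
```

In this sketch the partial result of the first cycle is never read out; it stays in the accumulator until the second cycle completes, which is the behavior the abstract attributes to in-situ temporal accumulation.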
