HURRY: Highly Utilized, Reconfigurable ReRAM-based In-situ Accelerator with Multifunctionality
Hery Shin, Jae-Young Kim, Donghyuk Kim, Joo-Young Kim
TL;DR
ReRAM-based in-situ accelerators for CNNs suffer from spatial and temporal underutilization, caused by suboptimal array sizing and excessive data movement. HURRY introduces reconfigurable, multifunctional ReRAM arrays with a Block Activation Scheme and system-level scheduling to raise both spatial and temporal utilization while reducing peripheral-circuit overhead. Key contributions include a $512\times512$ unit ReRAM array per IMA; reconfigurable block activation; functional blocks (FBs) for Conv/Res/FC, Max/ReLU, and Softmax; and inter-FB and intra-FB scheduling with HMS dataflow. Evaluations show up to $3.35\times$ speedup, $5.72\times$ higher energy efficiency, and $7.91\times$ greater area efficiency over baseline ReRAM-based accelerators, along with substantial gains in spatial and temporal utilization.
Abstract
Resistive random-access memory (ReRAM) crossbar arrays are suitable for efficient inference computations in neural networks due to their analog general matrix-matrix multiplication (GEMM) capabilities. However, traditional ReRAM-based accelerators suffer from spatial and temporal underutilization. We present HURRY, a reconfigurable and multifunctional ReRAM-based in-situ accelerator. HURRY uses a block activation scheme for concurrent activation of dynamically sized ReRAM portions, enhancing spatial utilization. Additionally, it incorporates functional blocks for convolution, ReLU, max pooling, and softmax computations to improve temporal utilization. System-level scheduling and data mapping strategies further optimize performance. Consequently, HURRY achieves up to $3.35\times$ speedup, $5.72\times$ higher energy efficiency, and $7.91\times$ greater area efficiency compared to current ReRAM-based accelerators.
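To make the notion of spatial utilization concrete, the following is a minimal back-of-the-envelope sketch, not HURRY's actual mapping or its Block Activation Scheme. It assumes a convolution layer's weights are unrolled into a matrix of (kernel × kernel × input channels) rows by (output channels) columns, that each ReRAM cell stores one weight (no bit slicing), and that the matrix is tiled over fixed-size units. The function names and the example layer are hypothetical.

```python
import math


def conv_weight_shape(kernel: int, c_in: int, c_out: int) -> tuple[int, int]:
    """Rows and columns of an unrolled conv weight matrix (assumed mapping)."""
    return kernel * kernel * c_in, c_out


def spatial_utilization(rows: int, cols: int, unit_rows: int, unit_cols: int) -> float:
    """Fraction of allocated cells that store weights when the matrix is
    tiled over fixed-size units of unit_rows x unit_cols cells."""
    units = math.ceil(rows / unit_rows) * math.ceil(cols / unit_cols)
    return (rows * cols) / (units * unit_rows * unit_cols)


# Hypothetical layer: 3x3 kernel, 64 input channels, 64 output channels.
rows, cols = conv_weight_shape(kernel=3, c_in=64, c_out=64)
for size in (512, 128, 64):
    util = spatial_utilization(rows, cols, size, size)
    print(f"{size}x{size} units: {util:.3f}")
```

Under these assumptions, the hypothetical layer fills about 7% of the cells it occupies when allocated in monolithic 512×512 units, 45% with 128×128 units, and 100% with 64×64 units. This simplified model only illustrates why activating smaller, dynamically sized portions of a large array can raise spatial utilization; the gains reported for HURRY come from its own evaluation, not from this calculation.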
