HYDRA: Hybrid Data Multiplexing and Run-time Layer Configurable DNN Accelerator
Sonu Kumar, Komal Gupta, Gopal Raut, Mukul Lokhande, Santosh Kumar Vishvakarma
TL;DR
HYDRA addresses edge DNN deployment challenges by introducing a layer-multiplexed accelerator that reuses the same hardware to execute networks of varying depth. It couples a $1$-D array of FMA units with a runtime layer-configurable design and a single activation function accessed via a parallel-in-serial-out path (PISO), enabling $L$-layer networks with reduced area and power. Experimental results show reductions of over 90% in power and resource usage relative to state-of-the-art designs, achieving $35.21$ TOPS/W at 100 MHz for a $64:32:32:10$ network. The work demonstrates practical edge deployment potential for DNNs with scalable hardware reuse and configurability, enabling efficient MNIST/CIFAR-10 style workloads.
Abstract
Deep neural networks (DNNs) are difficult to execute efficiently at edge nodes, primarily because of their large hardware resource demands. This article proposes HYDRA, a hybrid data-multiplexing and runtime layer-configurable DNN accelerator, to overcome these drawbacks. The design takes a layer-multiplexed approach and additionally reuses a single activation function (AF) within the execution of each layer, together with an improved fused multiply-accumulate (FMA) unit. It operates in an iterative mode, reusing the same hardware to execute different layers in a configurable fashion. The proposed architecture reduces power consumption and resource utilization by over 90% relative to state-of-the-art works and achieves 35.21 TOPS/W. It also reduces by (N-1) times the area overhead required for bandwidth, AF, and layer architecture. These results indicate that the HYDRA architecture supports efficient DNN computation while improving performance on resource-constrained edge devices.
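The layer-multiplexed, iterative execution described in the abstract can be illustrated with a small software model. This is only a behavioural sketch under stated assumptions, not the authors' hardware design: the function name `run_layer_multiplexed`, the choice of ReLU as the shared activation, the bias-initialised accumulators, and the application of the same activation to every layer (including the last) are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence


def relu(x: float) -> float:
    """Assumed activation; the abstract does not name the AF used."""
    return x if x > 0.0 else 0.0


def run_layer_multiplexed(
    inputs: Sequence[float],
    weights: Sequence[Sequence[Sequence[float]]],
    biases: Sequence[Sequence[float]],
    activation: Callable[[float], float] = relu,
) -> List[float]:
    """Run an L-layer network by reusing one bank of accumulators.

    weights[l][j][i] is the weight from input i to neuron j of layer l
    (hypothetical layout chosen for this sketch).
    """
    acts = list(inputs)
    # The same loop body is reused for every layer, which mimics the
    # iterative mode where one physical layer is reconfigured per pass.
    for w_layer, b_layer in zip(weights, biases):
        # 1-D array of FMA units: one accumulator per neuron, seeded
        # with the bias (an assumption of this model).
        acc = list(b_layer)
        for i, x in enumerate(acts):
            # Each input is broadcast to all FMA units in turn.
            for j in range(len(acc)):
                acc[j] = w_layer[j][i] * x + acc[j]
        # Parallel-in, serial-out: accumulator values pass one at a
        # time through the single shared activation function.
        acts = [activation(a) for a in acc]
    return acts


# Tiny 2:2:1 example network with made-up weights.
example = run_layer_multiplexed(
    [1.0, 2.0],
    [[[1.0, -1.0], [0.5, 0.5]], [[2.0, 2.0]]],
    [[0.0, 0.0], [1.0]],
)
print(example)  # [4.0]
```

In this model the hardware cost is bounded by the widest layer (the number of accumulators) and one activation unit, while network depth only adds loop iterations. That is the trade of area for time that the layer-multiplexed approach relies on.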
