EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration

Bo Liu; Grace Li Zhang; Xunzhao Yin; Ulf Schlichtmann; Bing Li


TL;DR

This work proposes a novel encoding-based digital MAC design that reduces circuit area by up to 48.79% and the power consumption of executing DNNs by up to 64.41%, while the accuracy of the neural networks is still well maintained.

Abstract

Deep neural networks (DNNs) have achieved major breakthroughs in fields such as image classification and natural language processing. However, executing DNNs requires a massive number of multiply-accumulate (MAC) operations on hardware and thus incurs large power consumption. To address this challenge, we propose a novel digital MAC design based on encoding. In this design, the multipliers are replaced by simple logic gates that represent the multiplication results with a wide bit representation. The outputs of the new multipliers are added by bit-wise weighted accumulation, and the accumulation results are compatible with existing computing platforms that accelerate neural networks. Since the multiplication function is replaced by a simple logic representation, the critical paths in the resulting circuits become much shorter. Correspondingly, the pipelining stages and intermediate registers used to store partial sums in the MAC array can be reduced, leading to a significantly smaller area as well as better power efficiency. The proposed design has been synthesized and verified on ResNet18-Cifar10, ResNet20-Cifar100, ResNet50-ImageNet, MobileNetV2-Cifar10, MobileNetV2-Cifar100, and EfficientNetB0-ImageNet. The experimental results confirm a reduction of circuit area by up to 48.79% and a reduction of the power consumption of executing DNNs by up to 64.41%, while the accuracy of the neural networks is still well maintained. The open-source code of this work is available on GitHub at https://github.com/Bo-Liu-TUM/EncodingNet/.
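
The bit-wise weighted accumulation described in the abstract relies on linearity: if every encoded product is a vector of bits with fixed position weights, the sum of the decoded products equals the weighted sum of the per-position bit counts. The following is a minimal sketch of that identity only. The position weights and the random bit vectors are hypothetical and are not the encoding found by the paper's search.

```python
import random


def decode(bits, weights):
    """Value of one encoded product: sum of position weights where the bit is 1."""
    return sum(w * b for w, b in zip(weights, bits))


def bitwise_weighted_accumulate(encoded_products, weights):
    """Count the ones in each bit position, then apply each position weight once."""
    counts = [sum(column) for column in zip(*encoded_products)]
    return sum(w * c for w, c in zip(weights, counts))


# Hypothetical position weights for a 6-bit wide encoding (illustrative only).
WEIGHTS = [-8, 4, 2, 1, 1, -2]

# Random bit vectors stand in for the outputs of 16 encoding-based multipliers.
rng = random.Random(0)
products = [[rng.randint(0, 1) for _ in WEIGHTS] for _ in range(16)]

# Decoding every product and summing gives the same result as
# accumulating per bit position and decoding once at the end.
direct = sum(decode(p, WEIGHTS) for p in products)
accumulated = bitwise_weighted_accumulate(products, WEIGHTS)
assert direct == accumulated
```

Under this view, the per-position counting corresponds to the accumulator and the single weighted sum corresponds to the decoder, so the weights are applied once per column of the MAC array rather than once per multiplier.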


Paper Structure (30 sections, 2 equations, 14 figures, 8 tables)


Introduction
Concept and Comparison with Related Work
Encoding-based MAC Design
Multiplier Design by Structural Search
Concept of the 2D Node Array
Description of the 2D Node Array
Search Algorithm for the Multiplier Design
Adder Design with Encoding
Design and Application of MAC Array
Fine-tune Neural Networks
Experimental Setup and Results
Experimental Setup
Setup for Hardware Synthesis
Setup for Neural Network Inference and Fine-tuning
Setup for CGP search
...and 15 more sections

Figures (14)

Figure 1: (a) Structure of a systolic array according to TPU. (b) Structure of a MAC unit.
Figure 2: Truth tables and corresponding multipliers with traditional two's complement encoding and a new encoding. (a) Truth tables of multipliers with the traditional encoding and a new encoding, where the position weights are shown at the bottom. (b) The traditional 2-bit signed multiplier. (c) The multiplier with a new encoding, called an encoding-based multiplier.
Figure 3: An example of an 8-bit candidate multiplier represented with a 2D regular node array, which consists of $16$ input nodes, $m$ output nodes, and internal nodes with $r$ rows (logic depths) and $c$ columns (logic levels). All input nodes and internal nodes are numbered, i.e., $0, 1, \dots, 15, 16, \dots, 15+r, \dots, 15+rc$. The function of each internal node can be any logic gate in a gate library $\Gamma = \{0^{identity}, 1^{not}, 2^{and}, 3^{or}, 4^{xor}, 5^{nand}, 6^{nor}, 7^{xnor}, 8^{const0}, 9^{const1}\}$, where the numbers $0, 1, \dots, 9$ denote the types of the logic gates. The inputs of internal nodes and output nodes can be connected to previous nodes. The representation of the 2D node array can be coded as a chain of integers, as shown at the bottom.
Figure 4: Two examples of logic mapping from input bits to output bits of a candidate multiplier. The position weights evaluated in each example are shown at the outputs. The resulting maximal relative error of each candidate is illustrated at the bottom. (a) A circuit candidate with a large maximal relative error. (b) A circuit candidate with a small maximal relative error.
Figure 5: A column in a MAC array consists of encoding-based multipliers and the circuit of the addition function, which consists of a bit-wise accumulator (ACC) and a decoder (DEC).
...and 9 more figures
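
The integer-chain representation mentioned in the Figure 3 caption can be illustrated with a small evaluator. This is a sketch under stated assumptions, not the paper's implementation: the gate ids follow the library $\Gamma$ from the caption, but the genome layout (three integers per internal node, followed by one source index per output node) and the tiny 4-input example are hypothetical.

```python
# Gate ids follow the library listed in the Figure 3 caption.
GATES = {
    0: lambda a, b: a,            # identity (uses the first input only)
    1: lambda a, b: 1 - a,        # not (uses the first input only)
    2: lambda a, b: a & b,        # and
    3: lambda a, b: a | b,        # or
    4: lambda a, b: a ^ b,        # xor
    5: lambda a, b: 1 - (a & b),  # nand
    6: lambda a, b: 1 - (a | b),  # nor
    7: lambda a, b: 1 - (a ^ b),  # xnor
    8: lambda a, b: 0,            # const0
    9: lambda a, b: 1,            # const1
}


def evaluate(genome, n_outputs, input_bits):
    """Evaluate a candidate circuit coded as a chain of integers.

    Assumed layout (hypothetical): [gate, in_a, in_b] for every internal
    node, followed by one source-node index per output node. Nodes are
    numbered with the inputs first, then the internal nodes in order.
    """
    values = list(input_bits)
    n_internal = (len(genome) - n_outputs) // 3
    for k in range(n_internal):
        gate, a, b = genome[3 * k: 3 * k + 3]
        # Each node may only read from previously numbered nodes.
        values.append(GATES[gate](values[a], values[b]))
    return [values[src] for src in genome[3 * n_internal:]]


# Toy example with 4 input nodes (0..3) and 3 internal nodes (4..6):
# node 4 = and(0, 2), node 5 = xor(1, 3), node 6 = nor(4, 5).
genome = [2, 0, 2,  4, 1, 3,  6, 4, 5,  4, 5, 6]
output_bits = evaluate(genome, n_outputs=3, input_bits=[1, 0, 1, 1])
assert output_bits == [1, 1, 0]
```

A search procedure would mutate the integers in such a genome and score each candidate by comparing its decoded outputs against the exact products over all input combinations.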
