A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network

Ruichen Ma, Guanchao Qiao, Yian Liu, Liwei Meng, Ning Ning, Yang Liu, Shaogang Hu

TL;DR

A&B BNN removes part of the multiplication operations in a traditional BNN outright and replaces the rest with an equal number of bit operations. It does so by introducing a mask layer and a quantized RPReLU structure on top of a normalizer-free network architecture.

Abstract

Binary neural networks use 1-bit quantized weights and activations to reduce both the storage demands and the computational burden of a model. However, advanced binary architectures still contain millions of inefficient, hardware-unfriendly full-precision multiplication operations. A&B BNN is proposed to remove part of the multiplication operations in a traditional BNN outright and replace the rest with an equal number of bit operations, by introducing a mask layer and a quantized RPReLU structure built on a normalizer-free network architecture. The mask layer can be removed during inference: straightforward mathematical transformations that exploit the intrinsic characteristics of BNNs eliminate the associated multiplication operations. The quantized RPReLU structure enables more efficient bit operations by constraining its slope to integer powers of 2. Experiments yield accuracies of 92.30%, 69.35%, and 66.89% on the CIFAR-10, CIFAR-100, and ImageNet datasets, respectively, which is competitive with the state of the art. Ablation studies verify the efficacy of the quantized RPReLU structure, which gives a 1.14% improvement on ImageNet over a fixed-slope RLeakyReLU. The proposed add&bit-operation-only BNN offers an innovative approach to hardware-friendly network architecture.
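
To make the power-of-2 slope constraint concrete, here is a minimal Python sketch of how such a slope turns the negative branch of an RPReLU-style activation into a bit shift. It is an illustration under stated assumptions, not the authors' implementation: activations are assumed to be integers (fixed point), the piecewise form with shift parameters `gamma` and `zeta` follows the usual RPReLU definition, and the rounding rule in `quantize_slope_to_power_of_two` (nearest exponent in the log2 domain) is a hypothetical choice, since the abstract does not state how the slope is quantized.

```python
import math


def quantize_slope_to_power_of_two(slope: float) -> int:
    """Return the integer exponent k such that 2**k approximates `slope`.

    Assumption: rounding to the nearest exponent in the log2 domain.
    The paper's actual quantization rule is not given in the abstract.
    """
    if slope <= 0:
        raise ValueError("slope must be positive")
    return round(math.log2(slope))


def shift_scale(x: int, k: int) -> int:
    """Multiply integer x by 2**k using only shifts (arithmetic shift for k < 0)."""
    return x << k if k >= 0 else x >> -k


def quantized_rprelu(x: int, k: int, gamma: int = 0, zeta: int = 0) -> int:
    """RPReLU-style activation whose negative-side slope is 2**k.

    Positive side: identity around the learned shifts gamma and zeta.
    Negative side: a bit shift replaces the slope multiplication.
    """
    shifted = x - gamma
    if shifted > 0:
        return shifted + zeta
    return shift_scale(shifted, k) + zeta


if __name__ == "__main__":
    k = quantize_slope_to_power_of_two(0.25)  # exponent -2, i.e. slope 1/4
    print(k, quantized_rprelu(-8, k), quantized_rprelu(5, k))
```

Right-shifting a negative integer rounds toward negative infinity, so the result can differ from the exact product by less than one unit; whether that matters depends on the fixed-point format the hardware uses.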

Paper Structure (15 sections, 8 equations, 10 figures, 6 tables)

Figures (10)

  • Figure 1: The architecture overview of the (a) baseline BN-Free network and (b) proposed A&B BNN. In contrast to the baseline network, the proposed A&B BNN eliminates all multiplication operations. The multiplication resulting from $\beta$ is absorbed into the newly introduced mask layer and can be removed directly during inference. Multiplications induced by both average pooling and $\alpha$ are substituted by equal but more efficient bit operations. Additionally, we introduce the quantized RPReLU structure, effectively removing the multiplication associated with PReLU. Circles represent multiplication operations, diamonds represent bit operations, and $\bigoplus$ represents residual addition.
  • Figure 2: The multiplication operand and corresponding ratio within the BN-Free ReActNet-18 and ReActNet-A structures. For input images with a resolution of $224\times224$, the former generates approximately 4.6 million multiplication operations, while the latter yields approximately 14.7 million.
  • Figure 3: (a) A visualization of gradient approximation techniques. The gradient is propagated through an approximate function that resembles the impulse function. (b) The mask layer is introduced to achieve the same effect.
  • Figure 4: (a) The original structure with multiplication. (b) The equivalent structure. Although it turns one multiplication operation into two, both can be eliminated.
  • Figure 5: The slope of RPReLU can be any continuous value greater than 0, while the slope of the proposed quantized RPReLU is only allowed to be an integer power of 2.
  • ...and 5 more figures
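
The caption of Figure 1 says that a scale factor can be dropped at inference and that the average-pooling multiplication can become a bit operation. The Python sketch below illustrates two elementary facts that are consistent with those statements; it does not reproduce the paper's derivation. The assumptions are: the scale is positive and feeds directly into a sign binarization, zero binarizes to +1, and pooling is over a 2x2 window of integers. The function names are hypothetical.

```python
from typing import Sequence


def sign(x: float) -> int:
    """Binarize to {-1, +1}; zero maps to +1 by the convention assumed here."""
    return 1 if x >= 0 else -1


def avg_pool_2x2_shift(window: Sequence[int]) -> int:
    """Average four integers with additions and one right shift by 2.

    Dividing by 4 is a shift because 4 is a power of 2; the result is
    rounded toward negative infinity.
    """
    if len(window) != 4:
        raise ValueError("expected exactly four values")
    return sum(window) >> 2


if __name__ == "__main__":
    # A positive scale in front of sign() never changes the binarized output,
    # so the multiplication can be skipped once training is done.
    for beta in (0.25, 1.5, 3.0):
        for x in (-2.0, -0.5, 0.0, 0.7):
            assert sign(beta * x) == sign(x)

    print(avg_pool_2x2_shift([4, 8, 12, 16]))
```

The first fact only holds for a strictly positive scale; a negative scale would flip the sign and would have to be folded into the weights or handled separately.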