TabConv: Low-Computation CNN Inference via Table Lookups

Neelesh Gupta, Narayanan Kannan, Pengmiao Zhang, Viktor Prasanna

TL;DR

TabConv tackles the high arithmetic cost of CNN inference by replacing convolutions and related operations with table lookups derived from Product Quantization. It folds batch normalization, maps linear and activation components to tabular forms, and uses a priority masking strategy based on cosine similarity to balance accuracy and computation. The approach yields substantial reductions in arithmetic operations (MFLOPs) while preserving most of the original model accuracy, at the expense of increased storage for the precomputed table entries. This time-space tradeoff presents a practical path toward low-computation CNN inference suitable for specialized hardware and energy-constrained deployments.
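Batch-normalization folding, mentioned above, is a standard inference-time identity: a convolution followed by batch normalization is equivalent to a single convolution with rescaled weights and a shifted bias. The NumPy sketch below illustrates that identity, assuming per-output-channel statistics and a weight layout of (out_channels, in_channels, kh, kw). The function name `fold_batchnorm` and its parameters are illustrative and not taken from the paper.

```python
import numpy as np

def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm parameters into the preceding convolution.

    weight: (out_channels, in_channels, kh, kw)
    bias, gamma, beta, mean, var: (out_channels,)
    Returns (folded_weight, folded_bias) such that
    conv(x, folded_weight) + folded_bias == batchnorm(conv(x, weight) + bias).
    """
    # Per-channel scale applied by batch normalization at inference time.
    scale = gamma / np.sqrt(var + eps)
    # Scale every filter by its channel's factor.
    folded_weight = weight * scale[:, None, None, None]
    # Shift the bias by the running mean, rescale, then add beta.
    folded_bias = (bias - mean) * scale + beta
    return folded_weight, folded_bias
```

After folding, the batch-normalization layer can be dropped, so only the fused convolution remains to be approximated.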

Abstract

Convolutional Neural Networks (CNNs) have demonstrated remarkable ability throughout the field of computer vision. However, CNN inference requires a large number of arithmetic operations, making CNNs expensive to deploy in hardware. Current approaches alleviate this issue by developing hardware-supported algorithmic processes that simplify spatial convolution functions. However, these methods still rely heavily on matrix multiplication, leading to significant computational overhead. To bridge the gap between hardware, algorithmic acceleration, and approximate matrix multiplication, we propose TabConv, a novel table-based approximation for convolution that significantly reduces arithmetic operations during inference. Additionally, we introduce a priority masking technique based on cosine similarity to select layers for table-based approximation, thereby maintaining model performance. We evaluate our approach on popular CNNs: ResNet-18, ResNet-34, and NetworkInNetwork (NIN). TabConv preserves over 93% of the original model's performance while reducing arithmetic operations by 36.5%, 25.8%, and 99.4% for ResNet-18 on CIFAR-10, CIFAR-100, and MNIST, respectively; by 35.6% and 99.3% for ResNet-34 on CIFAR-10 and MNIST; and by 98.9% for NIN on MNIST, achieving low-computation inference.
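To make the idea of a table-based approximation of matrix multiplication concrete, the sketch below shows generic product quantization in NumPy: inputs are split into subspaces, each subvector is replaced by its nearest learned prototype, and the prototype-weight dot products are precomputed so that inference reduces to lookups and additions. This is a simplified illustration, not the paper's implementation. All function names are hypothetical, the prototypes come from plain k-means, and the encoding step uses an exact nearest-prototype search, which itself costs arithmetic that a practical system would replace with a cheaper encoder.

```python
import numpy as np

def train_prototypes(X, n_subspaces, n_prototypes, n_iters=10, seed=0):
    """Learn k-means prototypes independently in each subspace of X (N, D).

    Assumes D is divisible by n_subspaces (illustrative simplification).
    """
    rng = np.random.default_rng(seed)
    sub_dim = X.shape[1] // n_subspaces
    prototypes = np.empty((n_subspaces, n_prototypes, sub_dim))
    for c in range(n_subspaces):
        sub = X[:, c * sub_dim:(c + 1) * sub_dim]
        # Initialize centers from randomly chosen training subvectors.
        centers = sub[rng.choice(len(sub), n_prototypes, replace=False)]
        for _ in range(n_iters):
            dists = ((sub[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for k in range(n_prototypes):
                members = sub[assign == k]
                if len(members):
                    centers[k] = members.mean(0)
        prototypes[c] = centers
    return prototypes

def build_table(prototypes, W):
    """Precompute table[c, k, m] = prototype (c, k) dotted with W's c-th row block."""
    n_subspaces, _, sub_dim = prototypes.shape
    W_sub = W.reshape(n_subspaces, sub_dim, -1)  # (C, sub_dim, M)
    return np.einsum('ckd,cdm->ckm', prototypes, W_sub)

def encode(X, prototypes):
    """Map each subvector of X (N, D) to the index of its nearest prototype."""
    n_subspaces, _, sub_dim = prototypes.shape
    X_sub = X.reshape(len(X), n_subspaces, sub_dim)  # (N, C, sub_dim)
    dists = ((X_sub[:, :, None, :] - prototypes[None]) ** 2).sum(-1)  # (N, C, K)
    return dists.argmin(-1)  # (N, C)

def lookup_matmul(codes, table):
    """Approximate X @ W by gathering table rows and summing over subspaces."""
    n_subspaces = table.shape[0]
    # table[c, codes[n, c]] has shape (N, C, M); summing over C yields (N, M).
    return table[np.arange(n_subspaces), codes].sum(axis=1)
```

The table holds `n_subspaces * n_prototypes * M` entries per weight matrix, which is where the time-space tradeoff described in the TL;DR shows up: fewer multiplications at inference in exchange for stored precomputed results.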

Paper Structure

This paper contains 33 sections, 10 equations, 9 figures, 6 tables, 1 algorithm.

Figures (9)

  • Figure 1: Training and query of product quantization.
  • Figure 2: Outline of the im2col Method.
  • Figure 3: Workflow of converting a CNN-based model to the proposed TabConv-based model.
  • Figure 4: Table construction for convolutional operation.
  • Figure 5: Table lookup for convolutional operation inference.
  • ...and 4 more figures