Design Environment of Quantization-Aware Edge AI Hardware for Few-Shot Learning

R. Kanda, N. Onizawa, M. Leonardon, V. Gripon, T. Hanyu

TL;DR

The paper tackles accuracy drift when porting few-shot learning to edge AI hardware by enforcing fixed-point processing across pre-training, evaluation, and hardware inference using Brevitas, implemented within the PEFSL pipeline on Tensil. It demonstrates that 5-bit/5-bit fixed-point with QAT and 6-bit/6-bit fixed-point with PTQ can closely match floating-point accuracy on miniImageNet and CIFAR-FS for 1- and 5-shot tasks with ResNet12, enabling substantial resource reductions on FPGA-like devices. The work contributes a versatile design-and-evaluation environment for edge AI hardware in few-shot learning, identifying practical hardware constraints and outlining avenues for future framework flexibility. Overall, fixed-point quantization across the entire design flow provides accuracy stability while opening opportunities for reduced compute and energy in edge deployments.

Abstract

This study aims to keep accuracy consistent across the entire design flow for edge AI hardware for few-shot learning by applying fixed-point data processing in the pre-training and evaluation phases. Specifically, the Brevitas quantization module is used to implement fixed-point data processing with arbitrarily specified bit widths for the integer and fractional parts. Two fixed-point quantization methods are used with Brevitas: quantization-aware training (QAT) and post-training quantization (PTQ). Tensil, which is used in the current design flow, requires the integer and fractional parts to be 8 bits each or 16 bits each when implemented in hardware. Performance validation nevertheless shows that accuracy comparable to floating-point operations can be maintained with 6 bits each or even 5 bits each, indicating potential for further reduction in computational resources. These results contribute to a versatile design and evaluation environment for edge AI hardware for few-shot learning.
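To make the "integer bits / fractional bits" notation concrete, the following is a minimal sketch of signed fixed-point rounding with saturation. It is not the Brevitas or Tensil implementation. The function name `quantize_fixed_point` is hypothetical, and the convention that the sign bit is counted inside the integer part is an assumption, since the abstract does not specify it.

```python
def quantize_fixed_point(x: float, int_bits: int, frac_bits: int) -> float:
    """Round x to a signed fixed-point grid and saturate at the range limits.

    Assumption: the sign bit is counted inside int_bits (two's complement),
    so the total word length is int_bits + frac_bits.
    """
    scale = 2 ** frac_bits                  # grid steps per unit (1 / resolution)
    total_bits = int_bits + frac_bits
    q_min = -(2 ** (total_bits - 1))        # most negative integer code
    q_max = 2 ** (total_bits - 1) - 1       # most positive integer code
    code = round(x * scale)                 # nearest grid point (ties go to even)
    code = max(q_min, min(q_max, code))     # saturate instead of wrapping around
    return code / scale


# 5-bit integer part / 5-bit fractional part: resolution 1/32, range [-16, 15.96875]
print(quantize_fixed_point(0.3, 5, 5))      # 0.3125
print(quantize_fixed_point(100.0, 5, 5))    # 15.96875 (saturated)
# 8-bit / 8-bit, one of the two formats the abstract says Tensil requires
print(quantize_fixed_point(0.3, 8, 8))      # 0.30078125
```

Under this convention, moving from 8/8 to 5/5 shortens the word from 16 to 10 bits, at the cost of a coarser grid (1/32 instead of 1/256) and a narrower representable range.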

Paper Structure

This paper contains 14 sections, 2 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Few-shot learning involves training with a small amount of additional data (support) for each class to be classified, followed by inference using test data (query).
  • Figure 2: Few-shot learning consists of: 1. training a backbone to extract features, 2. learning features from the trained backbone and a few additional data, and 3. executing classification tasks.
  • Figure 3: In the conventional flow (a), the hardware implementation uses fixed-point processing, while the pre-training phase employs floating-point processing. In the proposed flow (b), by quantizing the entire flow to fixed-point, consistency of accuracy is ensured.
  • Figure 4: Quantization-Aware Training (left) is a technique that quantizes the model during the training process, allowing the training to account for the effects of quantization, and Post-Training Quantization (right) is a method that applies quantization later to models that have been previously trained with floating-point processing.
  • Figure 5: Implementation of the convolutional layer. Fixed-point processing is activated by the arguments of the module.
  • ...and 1 more figure
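The support/query split described in Figures 1 and 2 can be illustrated with a small sketch. It assumes that the backbone has already produced feature vectors and that classification uses a nearest-class-mean rule. The captions do not name the classifier, so that rule, the function names `class_means` and `classify`, and the toy feature values are all assumptions made for illustration.

```python
import math


def class_means(support):
    """Average the support feature vectors of each class.

    support: dict mapping a class label to a list of feature vectors
    (one vector per shot), all of the same length.
    """
    means = {}
    for label, feats in support.items():
        dim = len(feats[0])
        means[label] = [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
    return means


def classify(query, means):
    """Return the label whose class mean is closest to the query (Euclidean)."""
    return min(means, key=lambda label: math.dist(query, means[label]))


# Hypothetical 2-way 2-shot task with 2-dimensional features
support = {
    "cat": [[1.0, 0.0], [0.8, 0.2]],
    "dog": [[0.0, 1.0], [0.2, 0.8]],
}
means = class_means(support)
print(classify([0.7, 0.3], means))  # cat
```

In a fixed-point flow such as the one in Figure 3(b), the feature vectors would come from a quantized backbone; this sketch keeps them in floating point for brevity.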