FSL-HDnn: A 5.7 TOPS/W End-to-end Few-shot Learning Classifier Accelerator with Feature Extraction and Hyperdimensional Computing

Haichao Yang, Chang Eun Song, Weihong Xu, Behnam Khaleghi, Uday Mallappa, Monil Shah, Keming Fan, Mingu Kang, Tajana Rosing

TL;DR

The paper addresses on-device continual learning under limited data and compute by proposing FSL-HDnn, an end-to-end accelerator that combines a weight-clustering CNN feature extractor with a few-shot learning (FSL) classifier based on hyperdimensional computing (HDC). The HDC classifier uses a cyclic random projection encoder to generate high-dimensional representations, enabling gradient-free, single-pass FSL updates and significantly reducing data movement. Measured on a 40 nm CMOS prototype, the design achieves 5.7 TOPS/W for feature extraction and 0.78 TOPS/W for HDC classification and learning, which the authors report as 2.6X and 6.6X improvements over state-of-the-art CNN and FSL processors, respectively, along with competitive accuracy and substantial improvements over kNN-based approaches. The work demonstrates practical, low-power on-device learning for edge applications that must adapt rapidly to new classes with minimal training data.
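
To make the gradient-free, single-pass update concrete, the following is a minimal software sketch under stated assumptions, not the chip's implementation. The block-circulant form of the encoder (a few stored rows of +1/-1, each reused through rotations), the sign binarization, training by adding sample hypervectors into a per-class accumulator, and cosine-similarity inference are common HDC choices that this summary does not confirm for FSL-HDnn. All function names are hypothetical.

```python
import math
import random


def make_encoder(feat_dim, hv_dim, seed=0):
    """Build an encoder mapping a feature vector to a bipolar hypervector.

    Assumption: 'cyclic random projection' is modeled as a block-circulant
    matrix. Only ceil(hv_dim / feat_dim) random rows are stored; every other
    row is a rotation of a stored row.
    """
    rng = random.Random(seed)
    n_blocks = -(-hv_dim // feat_dim)  # ceiling division
    bases = [[rng.choice((-1, 1)) for _ in range(feat_dim)]
             for _ in range(n_blocks)]

    def encode(x):
        hv = []
        for i in range(hv_dim):
            base, shift = bases[i // feat_dim], i % feat_dim
            # Row i of the projection is `base` rotated by `shift` positions.
            s = sum(base[(j + shift) % feat_dim] * x[j] for j in range(feat_dim))
            hv.append(1 if s >= 0 else -1)  # binarize to +1 / -1
        return hv

    return encode


def train(encode, samples):
    """Single pass over (features, label) pairs; no gradients involved.

    Each class hypervector is the element-wise sum of its encoded samples.
    """
    class_hvs = {}
    for x, label in samples:
        hv = encode(x)
        acc = class_hvs.setdefault(label, [0] * len(hv))
        for i, v in enumerate(hv):
            acc[i] += v
    return class_hvs


def cosine(a, b):
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(p * q for p, q in zip(a, b)) / (na * nb)


def classify(encode, class_hvs, x):
    """Return the label whose class hypervector is most similar to encode(x)."""
    hv = encode(x)
    return max(class_hvs, key=lambda c: cosine(hv, class_hvs[c]))


# Usage with made-up 4-dimensional features and two shots per class.
encode = make_encoder(feat_dim=4, hv_dim=64)
support = [([0.9, 0.8, 0.1, 0.0], "cat"), ([1.0, 0.7, 0.2, 0.1], "cat"),
           ([0.1, 0.0, 0.9, 1.0], "dog"), ([0.2, 0.1, 1.0, 0.8], "dog")]
model = train(encode, support)
label = classify(encode, model, [0.8, 0.9, 0.0, 0.2])
```

Adding a new class in this sketch only creates one more accumulator in `model`, which is why no backpropagation or stored gradient state is needed. The feature vectors would come from the CNN feature extractor in the real pipeline.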

Abstract

This paper introduces FSL-HDnn, an energy-efficient accelerator that implements the end-to-end pipeline of feature extraction, classification, and on-chip few-shot learning (FSL) through gradient-free learning techniques in a 40 nm CMOS process. At its core, FSL-HDnn integrates two low-power modules: a weight-clustering feature extractor and a Hyperdimensional Computing (HDC) classifier. The feature extractor uses weight clustering and pattern reuse strategies for optimized CNN-based feature extraction. Meanwhile, HDC serves as a lightweight FSL classifier, employing hyperdimensional vectors to improve training accuracy significantly compared to traditional distance-based approaches. This dual-module synergy not only simplifies the learning process by eliminating the need for complex gradients but also enhances energy efficiency and performance. Specifically, FSL-HDnn achieves an energy efficiency of 5.7 TOPS/W for feature extraction and 0.78 TOPS/W for the classification and learning phases, improvements of 2.6X and 6.6X, respectively, over current state-of-the-art CNN and FSL processors.
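
As an illustration of why weight clustering reduces arithmetic, the sketch below groups the weights of one filter into k shared values, stores a cluster index per weight, and computes a dot product by first accumulating inputs per index and then multiplying each accumulated sum by its centroid once. This is a minimal sketch under assumptions: the 1-D k-means clustering, the function names, and the example values are hypothetical, and the pattern reuse across filters that the abstract mentions is not modeled here.

```python
def cluster_weights(weights, k, iters=20):
    """1-D k-means (Lloyd's algorithm). Returns (centroids, index per weight)."""
    lo, hi = min(weights), max(weights)
    # Spread the initial centroids evenly over the weight range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def assign():
        return [min(range(k), key=lambda c: abs(w - centroids[c]))
                for w in weights]

    for _ in range(iters):
        idx = assign()
        for c in range(k):
            members = [w for w, i in zip(weights, idx) if i == c]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[c] = sum(members) / len(members)
    return centroids, assign()


def clustered_dot(inputs, idx, centroids):
    """Dot product using len(inputs) additions and only k multiplications."""
    acc = [0.0] * len(centroids)
    for x, i in zip(inputs, idx):
        acc[i] += x  # accumulate inputs that share a weight value
    return sum(a * c for a, c in zip(acc, centroids))


# Usage with made-up values: six weights that take only two distinct values.
weights = [0.5, 0.5, -0.25, -0.25, 0.5, -0.25]
inputs = [1, 2, 3, 4, 5, 6]
centroids, idx = cluster_weights(weights, k=2)
approx = clustered_dot(inputs, idx, centroids)
exact = sum(w * x for w, x in zip(weights, inputs))
```

In this example the clustering is lossless because the weights already take two values, so `approx` matches `exact` up to rounding. With real CNN weights the centroids only approximate the originals, trading some accuracy for fewer multiplications and smaller weight storage.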

Paper Structure

This paper contains 7 sections and 14 figures.

Figures (14)

  • Figure 1: Overview of the conventional few-shot learning pipeline with multilayer perceptron (MLP) search and the proposed FSL-HDnn pipeline.
  • Figure 2: Proposed end-to-end FSL-HDnn architecture.
  • Figure 3: Weight clustering: (a) average weight clustering and index for each weight, (b) accumulated input pixel reuse based on common pattern across filters.
  • Figure 4: (a) CNN feature extractor with weight clustering, (b) Feature extractor processing element (PE), (c) PE timing diagram.
  • Figure 5: Benefits from weight clustering: (a) Operations reduction, (b) Parameters reduction.
  • ...and 9 more figures