Zero-shot Classification using Hyperdimensional Computing

Samuele Ruffino, Geethan Karunaratne, Michael Hersche, Luca Benini, Abu Sebastian, Abbas Rahimi

TL;DR

The paper tackles zero-shot classification for fine-grained recognition by introducing HDC-ZSC, a hybrid model that couples a trainable image encoder with a stationary, high-dimensional binary attribute encoder and a cosine similarity kernel. It trains in three stages (pretraining on ImageNet, attribute extraction with fixed HDC dictionaries, and ZSC fine-tuning) to align image and attribute embeddings, reaching $63.8\%$ top-1 accuracy on CUB-200 with only $26.6\text{M}$ trainable parameters. Compared with two non-generative baselines, HDC-ZSC is $4.3\%$ and $9.9\%$ more accurate, while those baselines need more than $1.85\times$ and $1.72\times$ as many parameters, respectively; this places HDC-ZSC on the Pareto front of accuracy versus model size. Ablation studies show the best performance with ResNet50 and a projection dimension of $d=1536$, and the work suggests viable hardware implementations for energy-efficient, embedded deployment of ZSC. The results demonstrate that compact, HDC-based attribute representations can yield competitive ZSL performance with substantial parameter efficiency and practical applicability to edge devices.

Abstract

Classification based on Zero-shot Learning (ZSL) is the ability of a model to classify inputs into novel classes on which the model has not previously seen any training examples. Providing an auxiliary descriptor in the form of a set of attributes describing the new classes involved in the ZSL-based classification is one of the favored approaches to solving this challenging task. In this work, inspired by Hyperdimensional Computing (HDC), we propose the use of stationary binary codebooks of symbol-like distributed representations inside an attribute encoder to compactly represent a computationally simple end-to-end trainable model, which we name Hyperdimensional Computing Zero-shot Classifier (HDC-ZSC). It consists of a trainable image encoder, an attribute encoder based on HDC, and a similarity kernel. We show that HDC-ZSC can first be used to perform zero-shot attribute extraction tasks and can later be repurposed for Zero-shot Classification tasks with minimal architectural changes and minimal model retraining. HDC-ZSC achieves Pareto-optimal results with a 63.8% top-1 classification accuracy on the CUB-200 dataset while having only 26.6 million trainable parameters. Compared to two other state-of-the-art non-generative approaches, HDC-ZSC achieves 4.3% and 9.9% better accuracy, while they require more than 1.85× and 1.72× as many parameters as HDC-ZSC, respectively.
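
The three components named in the abstract (image encoder, HDC-based attribute encoder, similarity kernel) can be illustrated with a minimal sketch. It is not the authors' implementation. Several details are assumptions, since the text above does not specify them: codebook entries are drawn as random bipolar {-1, +1} vectors, an attribute is formed by binding a group vector with a value vector through elementwise multiplication, and a class embedding is a weighted sum of its attribute vectors. The attribute names, class names, and the noisy stand-in for the image encoder output are all hypothetical.

```python
import math
import random

D = 1536  # projection dimension reported as best in the ablation


def random_bipolar(d, rng):
    """Draw a fixed random {-1, +1} hypervector (one stationary codebook entry)."""
    return [rng.choice((-1, 1)) for _ in range(d)]


def bind(a, b):
    """Bind two bipolar hypervectors by elementwise multiplication (assumed operation)."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors, weights):
    """Weighted elementwise sum of hypervectors (assumed class-level aggregation)."""
    out = [0.0] * len(vectors[0])
    for vec, w in zip(vectors, weights):
        for i, x in enumerate(vec):
            out[i] += w * x
    return out


def cosine(a, b):
    """Cosine similarity kernel between two real-valued vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)

# Hypothetical codebooks: never trained, only generated once and stored.
groups = {g: random_bipolar(D, rng) for g in ("wing_color", "bill_shape")}
values = {v: random_bipolar(D, rng) for v in ("red", "blue", "hooked", "cone")}

attributes = [
    ("wing_color", "red"),
    ("wing_color", "blue"),
    ("bill_shape", "hooked"),
    ("bill_shape", "cone"),
]
attr_vectors = [bind(groups[g], values[v]) for g, v in attributes]

# Hypothetical class descriptors: strength of each attribute for each class.
class_attributes = {
    "class_a": [1.0, 0.0, 1.0, 0.0],
    "class_b": [0.0, 1.0, 0.0, 1.0],
}
class_embeddings = {
    name: bundle(attr_vectors, w) for name, w in class_attributes.items()
}


def classify(image_embedding):
    """Return the class whose attribute embedding is most similar to the image."""
    scores = {
        name: cosine(image_embedding, emb)
        for name, emb in class_embeddings.items()
    }
    return max(scores, key=scores.get), scores


# Stand-in for the trainable image encoder: a noisy copy of class_a's embedding.
image_embedding = [x + rng.gauss(0.0, 0.5) for x in class_embeddings["class_a"]]
predicted, scores = classify(image_embedding)
print(predicted, scores)
```

Because random high-dimensional bipolar vectors are nearly orthogonal, the similarity to the wrong class stays close to zero, so new classes can be added by writing a new descriptor row without touching the codebooks.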

Paper Structure

This paper contains 17 sections, 2 equations, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: The general model structure employed for Zero-shot Classification in this work. It comprises two main modules: a pre-trained foundation model image encoder and an HDC-based attribute encoder. The attribute encoder consists of weights with fixed binary vectors and binary vector operations, providing opportunities for implementation in resource-constrained edge devices. Modules filled with gray color remain stationary during inference.
  • Figure 2: Different phases of Zero-shot Classification training. (a) The backbone network is pre-trained on ImageNet data. (b) The backbone network and the projection FC are pre-trained on attribute extraction. (c) The backbone network and the projection FC are further fine-tuned on the Zero-shot Classification task with images of classes in the training dataset. Modules filled with gray color are stationary.
  • Figure 3: HDC-ZSC for Inference in Zero-shot Classification. Weights in both image and attribute encoders are stationary.
  • Figure 4: Comparison of Zero-shot Classification accuracy vs. model parameter count. Our models, HDC-ZSC and Trainable-MLP, are both on the Pareto front.
  • Figure 5: Hyperparameter tuning for HDC-ZSC on validation split (50 disjoint classes).
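
The Pareto-front claim behind Figure 4 can be checked with a short worked example that uses only the numbers quoted in the abstract. Two assumptions apply: the accuracy gaps are read as absolute percentage points, and the baseline parameter counts are lower bounds, because the abstract says "more than" 1.85× and 1.72×. The names `baseline_1` and `baseline_2` are placeholders, and Trainable-MLP is left out because its numbers are not given here.

```python
HDC_PARAMS_M = 26.6  # trainable parameters of HDC-ZSC, in millions
HDC_ACC = 63.8       # top-1 accuracy of HDC-ZSC on CUB-200, in percent

# (name, parameters in millions, top-1 accuracy in percent)
models = [
    ("HDC-ZSC", HDC_PARAMS_M, HDC_ACC),
    # Lower bound on size; accuracy assumed to be 4.3 points below HDC-ZSC.
    ("baseline_1", 1.85 * HDC_PARAMS_M, HDC_ACC - 4.3),
    # Lower bound on size; accuracy assumed to be 9.9 points below HDC-ZSC.
    ("baseline_2", 1.72 * HDC_PARAMS_M, HDC_ACC - 9.9),
]


def dominates(a, b):
    """True if a is no worse than b on both axes and strictly better on one."""
    _, params_a, acc_a = a
    _, params_b, acc_b = b
    no_worse = params_a <= params_b and acc_a >= acc_b
    strictly_better = params_a < params_b or acc_a > acc_b
    return no_worse and strictly_better


def pareto_front(points):
    """Keep every point that no other point dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


front = [name for name, _, _ in pareto_front(models)]
print(front)
for name, params, acc in models:
    print(f"{name}: {params:.2f}M parameters, {acc:.1f}% top-1")
```

Under these assumptions the two baselines have at least about 49.2M and 45.8M parameters with roughly 59.5% and 53.9% accuracy, so HDC-ZSC dominates both: it is smaller and more accurate at the same time.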