HyperCam: Low-Power Onboard Computer Vision for IoT Cameras

Chae Young Lee, Pu Yi, Maxwell Fite, Tejus Rao, Sara Achour, Zerina Kapetanovic

TL;DR

HyperCam introduces a low-power onboard image classifier for IoT cameras. It uses Hyperdimensional Computing (HDC) with novel image-encoding rewrites and sparse bundling to cut memory and compute. By replacing conventional pixel-position encoding with permutation-based codes, 1D position indexing, value-factorized hypervectors, and sparse bundling backed by Bloom Filter/Count Sketch, HyperCam achieves sub-0.3 s inference latency with 42.91 to 63.00 KB of flash and 22.25 KB of peak RAM, while maintaining competitive accuracy on MNIST, Fashion-MNIST, and face tasks. The FPGA/MCU prototype on an STM32U585AI demonstrates an energy-efficient, open-source onboard vision solution that outperforms several lightweight ML baselines in memory and latency, with accuracy remaining robust across tasks. The work highlights practical applicability for edge IoT cameras and outlines avenues for onboard training, cloud collaboration, and multi-modal sensing using HDC.

Abstract

We present HyperCam, an energy-efficient image classification pipeline that enables computer vision tasks onboard low-power IoT camera systems. HyperCam leverages hyperdimensional computing to perform training and inference efficiently on low-power microcontrollers. We implement a low-power wireless camera platform using off-the-shelf hardware and demonstrate that HyperCam can achieve an accuracy of 93.60%, 84.06%, 92.98%, and 72.79% for MNIST, Fashion-MNIST, Face Detection, and Face Identification tasks, respectively, while significantly outperforming other classifiers in resource efficiency. Specifically, it delivers an inference latency of 0.08 to 0.27 s while using 42.91 to 63.00 KB of flash memory and 22.25 KB of RAM at peak. Compared with other machine learning classifiers such as SVM, XGBoost, MicroNets, MobileNetV3, and MCUNetV3, HyperCam is the only classifier that achieves competitive accuracy while keeping its memory footprint and inference latency within the resource requirements of low-power camera systems.
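
To make the idea of training and inference with hyperdimensional computing more concrete, the following is a minimal sketch of a generic binary HD classifier. It is not HyperCam's encoder and omits the sparse bundling and sketching techniques the paper describes. It assumes dense binary hypervectors with XOR binding, majority-vote bundling, and cyclic-shift permutation; the dimension `D`, the toy 16-pixel images, and every function name (`random_hv`, `bind`, `permute`, `bundle`, `encode`, `train`, `classify`, `demo`) are hypothetical choices made for illustration.

```python
import random

D = 2048  # hypervector dimension; an illustrative choice, not taken from the paper


def random_hv(rng):
    """Draw a random dense binary hypervector of length D."""
    return [rng.randint(0, 1) for _ in range(D)]


def bind(a, b):
    """Binding: element-wise XOR, which is its own inverse."""
    return [x ^ y for x, y in zip(a, b)]


def permute(a, k):
    """Permutation: cyclic shift by k positions."""
    k %= len(a)
    return a[-k:] + a[:-k] if k else list(a)


def bundle(hvs):
    """Bundling: element-wise majority vote (ties resolve to 1)."""
    n = len(hvs)
    return [1 if 2 * sum(bits) >= n else 0 for bits in zip(*hvs)]


def hamming(a, b):
    """Number of positions where two hypervectors differ."""
    return sum(x != y for x, y in zip(a, b))


def encode(pixels, base_pos, value_hvs):
    """Bind a position code (base vector shifted by the pixel index)
    to each pixel's value vector, then bundle all pixels together."""
    bound = [bind(permute(base_pos, i), value_hvs[v])
             for i, v in enumerate(pixels)]
    return bundle(bound)


def train(samples_by_label, base_pos, value_hvs):
    """One prototype per class: the bundle of its encoded training images."""
    return {label: bundle([encode(s, base_pos, value_hvs) for s in samples])
            for label, samples in samples_by_label.items()}


def classify(pixels, prototypes, base_pos, value_hvs):
    """Return the label whose prototype is nearest in Hamming distance."""
    query = encode(pixels, base_pos, value_hvs)
    return min(prototypes, key=lambda label: hamming(query, prototypes[label]))


def demo():
    """Train on noisy copies of two toy 16-pixel patterns, classify the clean ones."""
    rng = random.Random(0)
    base_pos = random_hv(rng)
    value_hvs = [random_hv(rng), random_hv(rng)]  # one vector per pixel value
    top = [1] * 8 + [0] * 8
    bottom = [0] * 8 + [1] * 8

    def noisy(img, i):
        out = list(img)
        out[i] ^= 1  # flip a single pixel
        return out

    data = {"top": [noisy(top, i) for i in (0, 5, 10)],
            "bottom": [noisy(bottom, i) for i in (2, 7, 12)]}
    prototypes = train(data, base_pos, value_hvs)
    return (classify(top, prototypes, base_pos, value_hvs),
            classify(bottom, prototypes, base_pos, value_hvs))
```

Calling `demo()` returns the predicted labels for the two clean patterns. In this sketch, a trained model is one hypervector per class and inference is a handful of XOR and counting operations, which hints at why the approach can suit microcontrollers; the memory and latency figures quoted in the abstract come from the paper's own optimized encoder, not from code like this.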

Paper Structure (31 sections, 11 equations, 9 figures, 4 tables, 2 algorithms)

Figures (9)

  • Figure 1: HDC for image classification. HyperCam uses an HD classifier to perform face detection and identification tasks onboard low-power wireless camera platforms.
  • Figure 2: Memory layout of STM32U585AI.
  • Figure 3: Key operations of BSC. (1) Basis vectors are generated for every letter. (2) Binding of data creates a record. (3) Bundling of words creates a set. (4) Permutation is applied to create hypervectors on-the-fly.
  • Figure 4: HyperCam overview. The HyperCam classifier runs onboard a low-power wireless camera platform and has three key components: image encoder, training algorithm, and inference algorithm.
  • Figure 5: Sample images in the collected dataset.
  • ...and 4 more figures