Ternary-Input Binary-Weight CNN Accelerator Design for Miniature Object Classification System with Query-Driven Spatial DVS

Yuyang Li, Swasthik Muloor, Jack Laudati, Nickolas Dematteis, Yidam Park, Hana Kim, Nathan Chang, Inhee Lee

TL;DR

This paper targets miniature vision systems constrained by power and area. It proposes a ternary-input binary-weight CNN accelerator (TBN) paired with a reconfigurable spatial DVS, so that object recognition and tracking can share a single sensor. The architecture uses sparsity-aware zero-skipping and XOR-based multipliers, cutting data size by 81% and MAC operations by 27% while maintaining CIFAR-10-scale accuracy on DVS-formatted inputs. Inference takes 0.44 s at 1.6 mW with 82.56% top-1 accuracy, and the FoM improves by 7.3× over prior CNN accelerators for miniature systems. The work demonstrates the practicality of integrating spatial DVS and TBN for low-power, mm-scale vision systems.

Abstract

Miniature imaging systems are essential for space-constrained applications but are limited by memory and power constraints. While machine learning can reduce data size by extracting key features, its high energy demands often exceed the capacity of small batteries. This paper presents a CNN hardware accelerator optimized for object classification in miniature imaging systems. It processes data from a spatial Dynamic Vision Sensor (DVS), reconfigurable to a temporal DVS via pixel sharing, minimizing sensor area. By using ternary DVS outputs and a ternary-input, binary-weight neural network, the design reduces computation and memory needs. Fabricated in 28 nm CMOS, the accelerator cuts data size by 81% and MAC operations by 27%. It achieves 440 ms inference time at just 1.6 mW power consumption, improving the Figure-of-Merit (FoM) by 7.3x over prior CNN accelerators for miniature systems.
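
To make the ternary-input, binary-weight idea concrete, here is a minimal software sketch of one dot product with zero-skipping and a sign computed by XOR. It is an illustration under stated assumptions, not the paper's hardware: the function name `tbn_dot`, the sign-bit encoding (1 means negative), and the returned operation counter are all hypothetical choices made for this example.

```python
def tbn_dot(inputs, weights):
    """Dot product of ternary inputs (-1, 0, +1) with binary weights (-1, +1).

    Returns (accumulated sum, number of multiply-accumulates actually performed).
    Illustrative sketch only; the encoding below is an assumption.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")

    acc = 0
    macs = 0
    for x, w in zip(inputs, weights):
        if x == 0:
            # Zero-skipping: a zero input contributes nothing, so no
            # multiply or accumulate is issued for it.
            continue
        # Assumed encoding: sign bit is 1 for a negative value, 0 otherwise.
        x_sign = 1 if x < 0 else 0
        w_sign = 1 if w < 0 else 0
        # With magnitudes fixed at 1, the product's sign is the XOR of the
        # two sign bits, so the "multiplier" reduces to a single XOR.
        acc += -1 if (x_sign ^ w_sign) else 1
        macs += 1
    return acc, macs


if __name__ == "__main__":
    x = [1, 0, -1, 0, 1, -1]   # ternary activations, two of them zero
    w = [1, -1, -1, 1, -1, 1]  # binary weights
    total, performed = tbn_dot(x, w)
    print(total, performed, len(x))
```

In this toy input, two of the six activations are zero, so only four accumulate steps run. The real saving depends on how sparse the DVS output is, which this sketch does not model.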

Paper Structure

This paper contains 5 sections, 1 equation, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Target miniature vision system. (a) Object recognition and tracking using a shared image sensor configurable as either a spatial or a temporal DVS. (b) Contribution of the proposed accelerator.
  • Figure 2: Working mechanisms of spatial DVS. (a) Dataset generation for spatial DVS. (b) Implemented TBN architecture.
  • Figure 3: Different DVS configurations and accuracy results.
  • Figure 4: DVS-based TBN's performance. (a) Data size reduction for the same NN. (b) MAC operation reduction via sparsity awareness.
  • Figure 5: Block diagram of the proposed accelerator.
  • ...and 7 more figures