SNN-PAR: Energy Efficient Pedestrian Attribute Recognition via Spiking Neural Networks

Haiyang Wang, Qian Zhu, Mowen She, Yabo Li, Haoyu Song, Minghe Xu, Xiao Wang

TL;DR

The work targets energy-efficient pedestrian attribute recognition by replacing conventional ANNs with Spiking Neural Networks (SNNs). It introduces a spiking tokenizer and spiking Transformer backbone, followed by a feed-forward attribute head, and employs knowledge distillation from a strong ANN teacher to guide the SNN student. Experiments on PETA, PA100K, and RAPv1 show competitive accuracy-energy trade-offs and demonstrate parameter efficiency relative to large ANN backbones. The results establish a viable path toward energy-aware PAR and motivate future hybrid ANN-SNN designs to further improve performance.

Abstract

Pedestrian Attribute Recognition (PAR) based on artificial neural networks has been widely studied in recent years. Despite considerable progress, its energy consumption remains high. To address this issue, we propose a Spiking Neural Network (SNN) based framework for energy-efficient attribute recognition. Specifically, we first adopt a spiking tokenizer module to transform the given pedestrian image into spiking feature representations. The output is then fed into a spiking Transformer backbone network for energy-efficient feature extraction. We feed the enhanced spiking features into a set of feed-forward networks for pedestrian attribute recognition. In addition to the widely used binary cross-entropy loss function, we also exploit knowledge distillation from the artificial neural network to the spiking Transformer network for more accurate attribute recognition. Extensive experiments on three widely used PAR benchmark datasets validate the effectiveness of our proposed SNN-PAR framework. The source code of this paper is released at https://github.com/Event-AHU/OpenPAR.
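
The training objective described in the abstract combines binary cross-entropy with knowledge distillation from an ANN teacher. The abstract does not state the form of the distillation term or how the two terms are weighted, so the following is only a minimal sketch under stated assumptions: the distillation term is taken to be a mean squared error between student and teacher logits, and `kd_weight` is a hypothetical balancing coefficient. The function names are illustrative and are not taken from the released code.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over attributes, computed from raw logits."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|)).
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def distill_mse(student_logits, teacher_logits):
    """Assumed distillation term: mean squared error between logits."""
    squared = [(s - t) ** 2 for s, t in zip(student_logits, teacher_logits)]
    return sum(squared) / len(squared)


def total_loss(student_logits, teacher_logits, targets, kd_weight=0.5):
    """BCE on the SNN student plus a weighted distillation term.

    kd_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    bce = bce_with_logits(student_logits, targets)
    kd = distill_mse(student_logits, teacher_logits)
    return bce + kd_weight * kd


# Example: three attributes, student logits pulled toward the teacher's.
student = [1.2, -0.3, 0.0]
teacher = [2.0, -1.0, 0.5]
labels = [1.0, 0.0, 1.0]
print(total_loss(student, teacher, labels))
```

When the student already matches the teacher, the distillation term vanishes and the objective reduces to plain binary cross-entropy.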

Paper Structure

This paper contains 19 sections, 12 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Simulation of the leaky integrate-and-fire (LIF) model. Voltage and current rise with the onset of new spikes, and an output spike is generated if the voltage reaches $v_{th}$, after which it is reset to $v_{r}$. The diagram is redrawn based on FPGA [moursi2024efficient].
  • Figure 2: An overview of our proposed SNN-PAR framework, designed for energy-efficient pedestrian attribute recognition.
  • Figure 3: Visualization of pedestrian attributes predicted by our proposed model. Attributes shown in green are correctly predicted.
  • Figure 4: Visualization of heat maps given the corresponding pedestrian attribute.
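
The LIF behaviour summarized in the Figure 1 caption (integrate incoming spikes, fire when the voltage reaches $v_{th}$, then reset to $v_{r}$) can be illustrated with a small discrete-time simulation. This is a simplified sketch, not the paper's neuron implementation: it tracks only the membrane voltage (the separate current trace mentioned in the caption is omitted), and the `decay` and `weight` values are assumptions chosen for illustration.

```python
def simulate_lif(input_spikes, v_th=1.0, v_r=0.0, decay=0.9, weight=0.4):
    """Simulate one discrete-time leaky integrate-and-fire neuron.

    input_spikes: sequence of 0/1 values, one per time step.
    decay, weight: illustrative constants (assumed, not from the paper).
    Returns (output_spikes, voltages); voltages are recorded before reset.
    """
    v = v_r
    output_spikes, voltages = [], []
    for s in input_spikes:
        # Leak toward the reset potential, then integrate the weighted input.
        v = v_r + decay * (v - v_r) + weight * s
        voltages.append(v)
        if v >= v_th:
            # Threshold reached: emit a spike and reset the voltage to v_r.
            output_spikes.append(1)
            v = v_r
        else:
            output_spikes.append(0)
    return output_spikes, voltages


spikes, trace = simulate_lif([1, 1, 1, 0, 1, 1, 1])
print(spikes)  # [0, 0, 1, 0, 0, 0, 1]
```

With these constants, three consecutive input spikes are needed to push the voltage past the threshold, so the neuron fires on the third and seventh steps and stays silent otherwise.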