TP-Spikformer: Token Pruned Spiking Transformer

Wenjie Wei, Xiaolong Zhou, Malu Zhang, Ammar Belatreche, Qian Sun, Yimeng Shan, Dehao Zhang, Zijian Zhou, Zeyu Ma, Yang Yang, Haizhou Li

TL;DR

This paper proposes TP-Spikformer, a simple yet effective token pruning method for spiking transformers that reduces storage and computational overhead while maintaining competitive performance. Extensive experiments demonstrate its effectiveness, efficiency and scalability.

Abstract

Spiking neural networks (SNNs) offer an energy-efficient alternative to traditional neural networks due to their event-driven computing paradigm. However, recent advancements in spiking transformers have focused on improving accuracy with large-scale architectures, which require significant computational resources and limit deployment on resource-constrained devices. In this paper, we propose a simple yet effective token pruning method for spiking transformers, termed TP-Spikformer, that reduces storage and computational overhead while maintaining competitive performance. Specifically, we first introduce a heuristic spatiotemporal information-retaining criterion that comprehensively evaluates tokens' importance, assigning higher scores to informative tokens for retention and lower scores to uninformative ones for pruning. Based on this criterion, we propose an information-retaining token pruning framework that employs a block-level early stopping strategy for uninformative tokens instead of removing them outright, which helps preserve more information during token pruning. We demonstrate the effectiveness, efficiency and scalability of TP-Spikformer through extensive experiments across diverse architectures, including Spikformer, QKFormer and Spike-driven Transformer V1 and V3, and a range of tasks such as image classification, object detection, semantic segmentation and event-based object tracking. Notably, TP-Spikformer also performs well in a training-free setting. These results reveal its potential as an efficient and practical solution for deploying SNNs in real-world applications with limited computational resources.
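
To make the two ideas in the abstract concrete (scoring tokens, then letting low-scoring tokens stop early at a block instead of deleting them), the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not give the scoring formula, so `score_tokens` uses a hypothetical stand-in: a spatial term (mean activity of a token) plus a temporal term (mean change between consecutive time steps). The names `score_tokens`, `prune_and_run` and `keep_ratio` are illustrative, and reading "block-level early stopping" as "the token bypasses the block and is carried forward unchanged" is one possible interpretation.

```python
from typing import Callable, List

# Tokens are indexed as [time step][token][channel].
Tokens = List[List[List[float]]]


def score_tokens(x: Tokens) -> List[float]:
    """Hypothetical importance score per token (an assumption, not the paper's criterion)."""
    num_steps, num_tokens = len(x), len(x[0])
    scores = []
    for n in range(num_tokens):
        # Spatial term: mean activation of token n over time steps and channels.
        spatial = sum(sum(x[t][n]) / len(x[t][n]) for t in range(num_steps)) / num_steps
        # Temporal term: mean absolute change between consecutive time steps.
        temporal = 0.0
        for t in range(1, num_steps):
            diff = [abs(a - b) for a, b in zip(x[t][n], x[t - 1][n])]
            temporal += sum(diff) / len(diff)
        temporal /= max(num_steps - 1, 1)
        scores.append(spatial + temporal)
    return scores


def prune_and_run(x: Tokens, block: Callable[[Tokens], Tokens], keep_ratio: float) -> Tokens:
    """Run `block` on the top-scoring tokens only; the others skip it unchanged."""
    num_tokens = len(x[0])
    k = max(1, round(keep_ratio * num_tokens))
    scores = score_tokens(x)
    # Indices of the k highest-scoring tokens, restored to their original order.
    keep = sorted(sorted(range(num_tokens), key=lambda n: scores[n], reverse=True)[:k])
    kept = [[x_t[n] for n in keep] for x_t in x]
    processed = block(kept)
    # Copy the input so tokens that stop early keep their values and positions.
    out = [[list(tok) for tok in x_t] for x_t in x]
    for t, row in enumerate(processed):
        for j, n in enumerate(keep):
            out[t][n] = row[j]
    return out
```

Calling `prune_and_run(x, block, keep_ratio=0.5)` with any callable `block` that maps a token tensor to one of the same shape returns an output with the same number of tokens as the input. Only the retained half passes through `block`, which is where the compute saving in this sketch would come from.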

Paper Structure (40 sections, 14 equations, 7 figures, 16 tables, 1 algorithm)

Figures (7)

  • Figure 1: Visualization of token pruning across time steps and blocks with our method. Experiments are conducted on SDT-V1-8-512; white areas are pruned tokens.
  • Figure 2: The overall workflow of the proposed TP-Spikformer, including the information-retaining token pruning framework (top) and the spatiotemporal information-retaining criterion (bottom).
  • Figure 3: Visualization of the ground truth and our method's results, showing its efficacy on diverse downstream tasks.
  • Figure 4: Zero-finetuning accuracy preservation (top) and efficiency gains (bottom) of TP-Spikformer.
  • Figure 5: Visualization of spatial and temporal token scores in the 8th block of SDT-V1-8-768.
  • ...and 2 more figures