NeuroMoCo: A Neuromorphic Momentum Contrast Learning Method for Spiking Neural Networks

Yuqi Ma, Huamin Wang, Hangchi Shen, Xuemei Chen, Shukai Duan, Shiping Wen

TL;DR

This work tackles the challenge of building robust representations for spiking neural networks (SNNs) on event-based neuromorphic data by introducing NeuroMoCo, a self-supervised momentum-contrast pre-training framework tailored for SNNs. It introduces MixInfoNCE, a time-aware loss that fuses mean-based contrastive cues across the temporal dimension to better capture neuromorphic dynamics. The method employs dual encoders (master and student) with a dynamic queue and momentum updates to pre-train SNN backbones (CNN and Transformer variants) and demonstrates state-of-the-art accuracy on DVS-CIFAR10, DVS128Gesture, and N-Caltech101 using SEW-ResNet-18 and Spikformer-2-256. The findings highlight the effectiveness of SSL pre-training for neuromorphic vision, offering a pathway to more capable and energy-efficient SNNs for event-based perception tasks.
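
The momentum update and dynamic queue mentioned above follow the general momentum-contrast recipe. The sketch below illustrates that generic mechanism with NumPy arrays standing in for encoder weights; it is not the authors' implementation. The function names `momentum_update` and `enqueue_dequeue`, the coefficient `m`, and all shapes are assumptions chosen for illustration.

```python
import numpy as np

def momentum_update(key_params, query_params, m=0.999):
    """Exponential moving average: key <- m * key + (1 - m) * query.

    Only the query encoder is trained by backpropagation; the key encoder
    slowly tracks it. The value of m here is an assumed default.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]

def enqueue_dequeue(queue, keys):
    """Append the newest keys and drop the oldest so the queue size stays fixed."""
    size = queue.shape[0]
    return np.concatenate([queue, keys], axis=0)[-size:]

# Toy example with hypothetical shapes.
query_params = [np.ones((2, 2))]   # stand-in for query-encoder weights
key_params = [np.zeros((2, 2))]    # stand-in for key-encoder weights
key_params = momentum_update(key_params, query_params, m=0.9)

queue = np.zeros((4, 3))           # 4 stored negative keys of dimension 3
new_keys = np.ones((2, 3))         # keys from the current mini-batch
queue = enqueue_dequeue(queue, new_keys)
```

A large `m` keeps the key encoder changing slowly, so keys stored in the queue from earlier batches remain roughly consistent with freshly encoded ones.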

Abstract

Brain-inspired spiking neural networks (SNNs) have attracted considerable research attention owing to their bio-interpretability, event-triggered computation, and strong perception of spatiotemporal information, which make them well suited to event-based neuromorphic datasets. Compared with conventional static image datasets, event-based neuromorphic datasets make feature extraction harder because of their temporal structure and sparsity, which limits classification accuracy. To address this challenge, this paper introduces Neuromorphic Momentum Contrast Learning (NeuroMoCo), which extends the benefits of self-supervised pre-training to SNNs. This is the first time that self-supervised learning (SSL) based on momentum contrastive learning is realized in SNNs. In addition, we devise a novel loss function, MixInfoNCE, tailored to the temporal characteristics of neuromorphic data to further increase classification accuracy; its contribution is verified through ablation experiments. Finally, experiments on DVS-CIFAR10, DVS128Gesture and N-Caltech101 show that NeuroMoCo establishes new state-of-the-art (SOTA) results: 83.6% (Spikformer-2-256), 98.62% (Spikformer-2-256), and 84.4% (SEW-ResNet-18), respectively.

Paper Structure (18 sections, 5 equations, 4 figures, 4 tables)


Figures (4)

  • Figure 1: Overview of NeuroMoCo and the subsequent fine-tuning. NeuroMoCo includes an automatically updated queue and the S-Encoder, which is optimized by a momentum moving average. $x_q$ and $x_k$ are obtained by applying two different random data augmentations to a single sample drawn from the neuromorphic dataset. $T$ denotes the time dimension of the DVS data.
  • Figure 2: Collection and preprocessing of neuromorphic data. The DVS camera collects sparse event data, which are integrated into time frames and stored in time order in a multi-dimensional tensor.
  • Figure 3: Overview of the Spike-Element-Wise (SEW) block. We use ADD as the element-wise function $g$ and replace the original PLIF neurons with more versatile LIF neurons.
  • Figure 4: Schematic of MixInfoNCE. Following InfoNCE, we obtain a similarity matrix with time dimension $T$. Along $T$, MixInfoNCE adopts a mixed MBC and MAC paradigm.
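
The preprocessing described in the Figure 2 caption, integrating sparse events into time frames, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: events are assumed to be given as arrays of timestamp, x, y, and polarity (0 or 1), the time range is split into `T` equal-width bins, and each frame simply counts events per pixel and polarity. The function name `events_to_frames` and the output layout `[T, 2, H, W]` are hypothetical.

```python
import numpy as np

def events_to_frames(t, x, y, p, T, H, W):
    """Accumulate events into T frames of shape [2, H, W] (one channel per polarity)."""
    t = np.asarray(t, dtype=np.float64)
    x = np.asarray(x, dtype=int)
    y = np.asarray(y, dtype=int)
    p = np.asarray(p, dtype=int)

    frames = np.zeros((T, 2, H, W), dtype=np.float32)
    # Map each timestamp to one of T equal-width time bins (assumed binning rule).
    span = max(t.max() - t.min(), 1e-9)
    bins = np.minimum(((t - t.min()) / span * T).astype(int), T - 1)
    # np.add.at accumulates correctly even when several events hit the same cell.
    np.add.at(frames, (bins, p, y, x), 1.0)
    return frames

# Four toy events on a 2x2 sensor, split into T=2 frames.
frames = events_to_frames(
    t=[0, 1, 2, 3], x=[0, 1, 0, 1], y=[0, 0, 1, 1], p=[1, 0, 1, 0],
    T=2, H=2, W=2,
)
```

Other integration rules, such as bins with an equal number of events instead of equal duration, would fit the same output layout.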