NeuroMoCo: A Neuromorphic Momentum Contrast Learning Method for Spiking Neural Networks
Yuqi Ma, Huamin Wang, Hangchi Shen, Xuemei Chen, Shukai Duan, Shiping Wen
TL;DR
This work tackles the challenge of building robust representations for spiking neural networks (SNNs) on event-based neuromorphic data by introducing NeuroMoCo, a self-supervised momentum-contrast pre-training framework tailored for SNNs. It also proposes MixInfoNCE, a time-aware loss that fuses mean-based contrastive cues across the temporal dimension to better capture neuromorphic dynamics. The method uses dual encoders (master and student) with a dynamic queue and momentum updates to pre-train SNN backbones (CNN and Transformer variants), and it reports state-of-the-art accuracy on DVS-CIFAR10, DVS128Gesture, and N-Caltech101 using SEW-ResNet-18 and Spikformer-2-256. The findings highlight the effectiveness of self-supervised learning (SSL) pre-training for neuromorphic vision, offering a pathway to more capable and energy-efficient SNNs for event-based perception tasks.
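The momentum update and dynamic queue mentioned above follow the general momentum-contrast recipe, which can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names, the momentum value `m`, and the queue size are assumptions chosen for the example, and which of the two encoders (master or student) receives the momentum update is not specified in this section.

```python
import numpy as np

def momentum_update(online_params, momentum_params, m=0.999):
    """Exponential moving average: momentum <- m * momentum + (1 - m) * online.

    Both arguments are lists of arrays with matching shapes (hypothetical
    stand-ins for the weights of the two encoders).
    """
    return [m * k + (1.0 - m) * q for q, k in zip(online_params, momentum_params)]

def enqueue_dequeue(queue, new_keys, max_size):
    """Append the newest key features and keep only the most recent max_size rows."""
    queue = np.concatenate([queue, new_keys], axis=0)
    return queue[-max_size:]

# Toy usage: two parameter tensors and a queue of 4-dimensional key features.
online_params = [np.ones((2, 2)), np.zeros(3)]
momentum_params = [np.zeros((2, 2)), np.ones(3)]
momentum_params = momentum_update(online_params, momentum_params, m=0.9)

queue = np.zeros((0, 4))
for step in range(3):
    batch_keys = np.full((2, 4), float(step))  # placeholder for encoded keys
    queue = enqueue_dequeue(queue, batch_keys, max_size=4)
```

In this pattern the queue decouples the number of negatives from the batch size, while the slow-moving update keeps the queued keys reasonably consistent with one another across training steps.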
Abstract
Recently, brain-inspired spiking neural networks (SNNs) have attracted great research attention owing to their inherent bio-interpretability, event-triggered properties and powerful perception of spatiotemporal information, which make them well suited to handling event-based neuromorphic datasets. Compared with conventional static image datasets, event-based neuromorphic datasets make feature extraction more difficult because of their distinctive time-series and sparsity characteristics, which adversely affects classification accuracy. To overcome this challenge, this paper introduces Neuromorphic Momentum Contrast Learning (NeuroMoCo), a novel approach that extends the benefits of self-supervised pre-training to SNNs to effectively stimulate their potential. This is the first realization of self-supervised learning (SSL) based on momentum contrastive learning in SNNs. In addition, we devise a novel loss function named MixInfoNCE, tailored to the temporal characteristics of neuromorphic data, to further increase classification accuracy on neuromorphic datasets; its benefit is verified through rigorous ablation experiments. Finally, experiments on DVS-CIFAR10, DVS128Gesture and N-Caltech101 show that NeuroMoCo establishes new state-of-the-art (SOTA) results: 83.6% (Spikformer-2-256), 98.62% (Spikformer-2-256), and 84.4% (SEW-ResNet-18), respectively.
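To make the idea of a temporally aware contrastive loss concrete, the sketch below computes the standard InfoNCE loss and then blends two variants of it for features with a time axis. The exact MixInfoNCE formulation is not given in this section, so the blend shown here is only a hypothetical illustration: the function `mixed_temporal_info_nce`, the weight `alpha`, the temperature `tau`, and the tensor layout `(T, B, D)` are all assumptions, not the authors' definition.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def info_nce(q, k_pos, queue, tau=0.07):
    """Standard InfoNCE: q, k_pos are (B, D); queue holds (K, D) negatives."""
    q, k_pos, queue = l2_normalize(q), l2_normalize(k_pos), l2_normalize(queue)
    l_pos = np.sum(q * k_pos, axis=1, keepdims=True)   # (B, 1) positive logits
    l_neg = q @ queue.T                                # (B, K) negative logits
    logits = np.concatenate([l_pos, l_neg], axis=1) / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob.mean())

def mixed_temporal_info_nce(q_t, k_t, queue, tau=0.07, alpha=0.5):
    """Hypothetical blend for (T, B, D) features, NOT the paper's MixInfoNCE.

    Combines InfoNCE on time-averaged features with the mean of the
    per-time-step InfoNCE losses, weighted by the assumed factor alpha.
    """
    loss_on_mean = info_nce(q_t.mean(axis=0), k_t.mean(axis=0), queue, tau)
    per_step = [info_nce(q_t[t], k_t[t], queue, tau) for t in range(q_t.shape[0])]
    return alpha * loss_on_mean + (1.0 - alpha) * float(np.mean(per_step))

# Toy usage: T=4 time steps, batch of 8, 16-dimensional features, 32 negatives.
rng = np.random.default_rng(0)
q_t = rng.normal(size=(4, 8, 16))
queue = rng.normal(size=(32, 16))
k_aligned = q_t + 0.01 * rng.normal(size=q_t.shape)  # keys close to queries
k_random = rng.normal(size=q_t.shape)                # keys unrelated to queries

loss_aligned = mixed_temporal_info_nce(q_t, k_aligned, queue)
loss_random = mixed_temporal_info_nce(q_t, k_random, queue)
```

With keys that closely match their queries, the loss should come out much smaller than with unrelated keys, which is the behaviour any contrastive objective of this family is expected to show.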
