Learning from Dense Events: Towards Fast Spiking Neural Networks Training via Event Dataset Distillation
Shuhan Ye, Yi Yu, Qixin Zhang, Chenqi Kong, Qiangqiang Wu, Kun Wang, Xudong Jiang
TL;DR
This work tackles the costly training of spiking neural networks (SNNs) on temporally dense event streams by introducing PACE, a dataset distillation framework tailored for event data. PACE comprises two modules: Spatial-Temporal Densified Spike Matching (ST-DSM), which densifies and aligns spike patterns in space and time, and PEQ-N, a straight-through probabilistic event quantizer that preserves gradient flow while producing integer event frames. Across DVS-Gesture, CIFAR10-DVS, and N-MNIST, PACE outperforms existing coreset and distillation baselines, with particularly large gains on dynamic streams and at low to moderate images per class (IPC). For example, on N-MNIST with IPC=$1$, it achieves $84.4\%$ accuracy, about $85\%$ of the accuracy obtained with the full training set, while reducing training time by $>50\times$ and storage by $>6000\times$. The distilled surrogates transfer to other SNN backbones and enable minute-scale training, supporting efficient edge deployment and indicating a practical path toward scalable neuromorphic vision systems.
Abstract
Event cameras, which sense brightness changes and output binary asynchronous event streams, are attracting increasing attention. Their bio-inspired dynamics align well with spiking neural networks (SNNs), offering a promising energy-efficient alternative to conventional vision systems. However, SNNs remain costly to train due to temporal coding, which limits their practical deployment. To alleviate the high training cost of SNNs, we introduce \textbf{PACE} (Phase-Aligned Condensation for Events), the first dataset distillation framework for SNNs and event-based vision. PACE distills a large training dataset into a compact synthetic one that enables fast SNN training, using two core modules: \textbf{ST-DSM} and \textbf{PEQ-N}. ST-DSM uses residual membrane potentials to densify spike-based features (SDR) and to perform fine-grained spatiotemporal matching of amplitude and phase (ST-SM), while PEQ-N provides a plug-and-play straight-through probabilistic integer quantizer compatible with standard event-frame pipelines. Across the DVS-Gesture, CIFAR10-DVS, and N-MNIST datasets, PACE outperforms existing coreset selection and dataset distillation baselines, with particularly strong gains on dynamic event streams and at low or moderate IPC. Specifically, on N-MNIST, it achieves \(84.4\%\) accuracy, about \(85\%\) of the full training set performance, while reducing training time by more than \(50\times\) and storage cost by \(6000\times\), yielding compact surrogates that enable minute-scale SNN training and efficient edge deployment.
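
The abstract describes PEQ-N only as a straight-through probabilistic integer quantizer, so the following is a minimal sketch of that general idea and not the authors' implementation. It assumes stochastic rounding as the probabilistic step and an identity gradient inside a clamp range as the straight-through step. The function names and the `max_count` parameter are hypothetical and were chosen for illustration.

```python
import math
import random


def stochastic_round(x, rng):
    """Round x down or up; the probability of rounding up is its fractional part."""
    lo = math.floor(x)
    frac = x - lo
    return lo + (1 if rng.random() < frac else 0)


def quantize_frame(frame, max_count, rng):
    """Forward pass: clamp each value to [0, max_count], then round stochastically.

    `max_count` is an assumed upper bound on per-pixel event counts.
    """
    return [
        [stochastic_round(min(max(v, 0.0), float(max_count)), rng) for v in row]
        for row in frame
    ]


def straight_through_grad(frame, upstream, max_count):
    """Backward pass: treat rounding as the identity.

    The upstream gradient passes unchanged where the input lies inside the
    clamp range and is zeroed where the clamp was active.
    """
    return [
        [g if 0.0 <= v <= max_count else 0.0 for v, g in zip(v_row, g_row)]
        for v_row, g_row in zip(frame, upstream)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    synthetic = [[-1.5, 0.0, 2.3], [4.0, 9.7, 0.5]]  # real-valued synthetic frame
    print(quantize_frame(synthetic, 5, rng))          # integer event counts
    ones = [[1.0] * 3, [1.0] * 3]
    print(straight_through_grad(synthetic, ones, 5))  # gradient w.r.t. the input
```

Because the rounding is unbiased in expectation, the integer frame matches the real-valued one on average, while the identity backward pass lets a distillation loss update the underlying real-valued frame. In an autodiff framework the same effect is commonly obtained by adding the detached difference between the quantized and real values to the input.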
