Spike-EVPR: Deep Spiking Residual Networks with SNN-Tailored Representations for Event-Based Visual Place Recognition
Zuntao Liu, Yaohui Li, Chenming Hu, Delei Kong, Junjie Jiang, Zheng Fang
TL;DR
Spike-EVPR introduces spike-compatible event representations and a deep spiking residual architecture for event-based visual place recognition (EVPR). It combines two representations, MCS-Tensor and TSS-Tensor, with a BSR-Encoder, an SSD-Extractor, and a CDA-Module to learn robust global descriptors end-to-end under triplet supervision. The approach achieves state-of-the-art results on Brisbane-Event-VPR and DDD20 while consuming substantially less energy than ANN baselines and prior SNN methods. These results indicate that energy-efficient, end-to-end SNNs are practical for large-scale EVPR, and they offer a basis for representation learning on spatio-temporal event data and for neuromorphic deployment in place recognition.
Abstract
Event cameras are ideal for visual place recognition (VPR) in challenging environments due to their high temporal resolution and high dynamic range. However, existing methods convert sparse events into dense frame-like representations for Artificial Neural Networks (ANNs), ignoring event sparsity and incurring high computational cost. Spiking Neural Networks (SNNs) complement event data through discrete spike signals to enable energy-efficient VPR, but their application is hindered by the lack of effective spike-compatible representations and deep architectures capable of learning discriminative global descriptors. To address these limitations, we propose Spike-EVPR, a directly trained, end-to-end SNN framework tailored for event-based VPR. First, we introduce two complementary event representations, MCS-Tensor and TSS-Tensor, designed to reduce temporal redundancy while preserving essential spatio-temporal cues. Furthermore, we propose a deep spiking residual architecture that effectively aggregates these features to generate robust place descriptors. Extensive experiments on the Brisbane-Event-VPR and DDD20 datasets demonstrate that Spike-EVPR achieves state-of-the-art performance, improving Recall@1 by 7.61% and 13.20%, respectively, while significantly reducing energy consumption.
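
For readers unfamiliar with the metric, the Recall@1 figure reported above can be illustrated with a small sketch. This is not the authors' evaluation code: the function names `l2_distance` and `recall_at_n`, the choice of Euclidean distance, and the toy descriptors and ground-truth sets are all assumptions made for illustration. The idea is that a query counts as correctly localized if at least one of its N nearest database descriptors is a true match.

```python
import math

def l2_distance(a, b):
    # Euclidean distance between two descriptors (assumed metric).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recall_at_n(query_desc, db_desc, positives, n=1):
    # Fraction of queries whose top-n retrieved database entries
    # contain at least one ground-truth positive index.
    hits = 0
    for q_idx, q in enumerate(query_desc):
        ranked = sorted(range(len(db_desc)),
                        key=lambda i: l2_distance(q, db_desc[i]))
        if any(i in positives[q_idx] for i in ranked[:n]):
            hits += 1
    return hits / len(query_desc)

# Toy data (hypothetical): three database places, three queries.
db = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
queries = [[0.1, 0.0], [0.9, 0.1], [0.7, 0.6]]
positives = [{0}, {1}, {2}]  # ground-truth matches per query

recall_1 = recall_at_n(queries, db, positives, n=1)
recall_2 = recall_at_n(queries, db, positives, n=2)
print(f"Recall@1 = {recall_1:.3f}, Recall@2 = {recall_2:.3f}")
```

In this toy case the third query's nearest database entry is a wrong place and its true match ranks second, so it counts as a miss for Recall@1 but a hit for Recall@2.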
