MS2Edge: Towards Energy-Efficient and Crisp Edge Detection with Multi-Scale Residual Learning in SNNs
Yimeng Fan, Changsong Liu, Mingyang Li, Yuzhou Dai, Yanyan Liu, Wei Zhang
TL;DR
This paper tackles two weaknesses of traditional ANN-based edge detectors, high energy consumption and poor edge crispness, by introducing MS2Edge, the first SNN-based edge detector. It combines a multi-scale residual spiking backbone (MS2ResNet) with Membrane-based Deformed Shortcuts (MDS) and I-LIF neurons, a Spiking Multi-Scale Upsample Block (SMSUB) for detail reconstruction, and Membrane Average Decoding (MAD) for integrating edge maps across multiple time steps, all trained from scratch via surrogate gradients. The model achieves state-of-the-art results on BSDS500, NYUDv2, BIPED, PLDU, and PLDM without pre-trained backbones, while maintaining ultralow energy consumption thanks to the spike-driven computation and quantization of SNNs. This work demonstrates that SNNs can deliver both crisp, high-quality edge maps and substantial energy savings, paving the way for efficient edge detection in neuromorphic and real-time settings, with potential extensions to other low-level vision tasks. The energy model is $E_{SNNs}=T\times\left(fr\times E_{ACs}\times \eta_{ACs}+E_{MACs}\times \eta_{MACs}\right)$: energy grows linearly with the number of time steps $T$, and the accumulate (AC) term is further scaled by the firing activity $fr$, so fewer time steps and sparser spiking translate directly into lower energy.
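To make the energy formula concrete, the following is a minimal Python sketch that evaluates it directly. The function name `snn_energy` and its parameter names are hypothetical. The sketch assumes that $E_{ACs}$ and $E_{MACs}$ are per-operation energies and that $\eta_{ACs}$ and $\eta_{MACs}$ are operation counts, which the text above does not state explicitly. The numbers in the usage lines are placeholders, not values from the paper.

```python
def snn_energy(time_steps, firing_rate, e_ac, n_ac, e_mac, n_mac):
    """Evaluate E_SNNs = T * (fr * E_ACs * eta_ACs + E_MACs * eta_MACs).

    Assumed meanings (not confirmed by the source text):
      time_steps  -- T, number of simulation time steps
      firing_rate -- fr, average firing activity of the spiking layers
      e_ac, e_mac -- energy per accumulate / multiply-accumulate operation
      n_ac, n_mac -- eta_ACs / eta_MACs, interpreted here as operation counts
    """
    if time_steps < 1:
        raise ValueError("time_steps must be at least 1")
    if firing_rate < 0:
        raise ValueError("firing_rate must be non-negative")
    # Only the AC term is scaled by the firing rate; the MAC term is not.
    ac_energy = firing_rate * e_ac * n_ac
    mac_energy = e_mac * n_mac
    return time_steps * (ac_energy + mac_energy)


# Placeholder inputs, chosen only to show how the terms combine.
baseline = snn_energy(time_steps=4, firing_rate=0.1,
                      e_ac=0.9, n_ac=1000, e_mac=4.6, n_mac=100)
halved_steps = snn_energy(time_steps=2, firing_rate=0.1,
                          e_ac=0.9, n_ac=1000, e_mac=4.6, n_mac=100)
print(baseline, halved_steps)
```

Under these assumptions, halving the number of time steps halves the estimated energy, while lowering the firing rate reduces only the AC contribution.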
Abstract
Edge detection with Artificial Neural Networks (ANNs) has achieved remarkable progress but faces two major challenges. First, it requires pre-training on large-scale extra data and complex designs for prior knowledge, leading to high energy consumption. Second, the predicted edges lack crispness and rely heavily on post-processing. Spiking Neural Networks (SNNs), as third-generation neural networks, feature quantization and spike-driven computation mechanisms. They inherently provide a strong prior for edge detection in an energy-efficient manner, while their quantization mechanism helps suppress texture artifact interference around true edges, improving prediction crispness. However, the resulting quantization error inevitably introduces sparse edge discontinuities, which limits further gains in crispness. To address these challenges, we propose MS2Edge, the first SNN-based model for edge detection. At its core, we build a novel spiking backbone named MS2ResNet that integrates multi-scale residual learning to recover missing boundary lines and generate crisp edges, while combining I-LIF neurons with Membrane-based Deformed Shortcut (MDS) to mitigate quantization errors. The model is complemented by a Spiking Multi-Scale Upsample Block (SMSUB) for detail reconstruction during upsampling and a Membrane Average Decoding (MAD) method for effective integration of edge maps across multiple time steps. Experimental results demonstrate that MS2Edge outperforms ANN-based methods and achieves state-of-the-art performance on the BSDS500, NYUDv2, BIPED, PLDU, and PLDM datasets without pre-trained backbones, while maintaining ultralow energy consumption and generating crisp edge maps without post-processing.
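The abstract describes MAD only as a way to integrate edge maps across time steps. As a rough illustration of that idea, and not the authors' implementation, the sketch below averages per-pixel membrane potentials over the time dimension and squashes the result into an edge probability. The function name `membrane_average_decode`, the flat-list layout of the inputs, and the final sigmoid are all assumptions made for this example.

```python
import math


def membrane_average_decode(membrane_per_step):
    """Average membrane potentials over time steps, then apply a sigmoid.

    membrane_per_step: list of T lists, each holding the output-layer
    membrane potential of every pixel at one time step (hypothetical layout).
    Returns one edge probability per pixel.
    """
    if not membrane_per_step:
        raise ValueError("need at least one time step")
    num_steps = len(membrane_per_step)
    num_pixels = len(membrane_per_step[0])
    if any(len(step) != num_pixels for step in membrane_per_step):
        raise ValueError("all time steps must have the same number of pixels")

    # Mean over the time dimension for each pixel.
    mean_potential = [
        sum(step[i] for step in membrane_per_step) / num_steps
        for i in range(num_pixels)
    ]
    # Assumed output activation: map the averaged potential to (0, 1).
    return [1.0 / (1.0 + math.exp(-v)) for v in mean_potential]


# Two time steps, two pixels: both pixels average to a potential of 0.
print(membrane_average_decode([[0.0, 2.0], [0.0, -2.0]]))
```

Averaging continuous membrane potentials rather than binary spikes keeps a graded output from a small number of time steps, which is one plausible reading of why such a decoder would suit edge maps.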
