Seeing the Unseen in Low-light Spike Streams
Liwen Hu, Yang Li, Mianzhi Liu, Yijia Guo, Shenghao Xie, Ziluo Ding, Tiejun Huang, Lei Ma
TL;DR
Diff-SPK reconstructs images from low-light, high-speed spike streams using a diffusion-based pipeline conditioned on Enhanced Texture from Inter-spike Interval (ETFI). By introducing an ETFI encoding and a fusion module, Diff-SPK integrates temporal spike information into a Latent Diffusion Model with ControlNet, enabling high-fidelity texture synthesis in challenging lighting. The approach is validated on a large SA_SPK dataset and on both synthetic and real spike-camera data, showing clear improvements over traditional methods and prior diffusion-based approaches, particularly under very dark conditions. The work also provides the first benchmark for low-light spike-stream reconstruction and reports strong generalization across spike-camera variants, which is relevant for practical high-speed vision tasks.
Abstract
The spike camera, a type of neuromorphic sensor with high temporal resolution, shows great promise for high-speed visual tasks. Unlike traditional cameras, a spike camera continuously accumulates photons and fires asynchronous spike streams. Because of this unique data modality, spike streams require reconstruction methods to become perceptible to the human eye. However, many methods struggle to handle spike streams in low-light, high-speed scenarios due to severe noise and sparse information. In this work, we propose Diff-SPK, a diffusion-based reconstruction method. Diff-SPK leverages generative priors to supplement texture information under diverse low-light conditions. Specifically, it first employs an Enhanced Texture from Inter-spike Interval (ETFI) to aggregate sparse information from low-light spike streams. Then, the ETFI, encoded by a suitable encoder, serves as the input to ControlNet for high-speed scene generation. To improve the quality of the results, we introduce an ETFI-based feature fusion module during the generation process.
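To make the sparsity problem concrete, the following is a minimal sketch of a single spike-camera pixel and of the plain Texture from Inter-spike Interval (TFI) estimate as it is commonly described in the spike-camera literature. It is not the ETFI proposed here, whose enhancement is not specified in the abstract. The function names `simulate_spikes` and `tfi`, the unit firing threshold, and the idealized noise-free integrate-and-fire model are all assumptions made for illustration.

```python
def simulate_spikes(intensity, n_steps, threshold=1.0):
    """Idealized integrate-and-fire pixel (assumed model, no noise).

    Each time step adds `intensity` to an accumulator. When the accumulator
    reaches `threshold`, the pixel emits a spike (1) and the threshold is
    subtracted; otherwise it emits 0.
    """
    acc = 0.0
    spikes = []
    for _ in range(n_steps):
        acc += intensity
        if acc >= threshold:
            spikes.append(1)
            acc -= threshold
        else:
            spikes.append(0)
    return spikes


def tfi(spikes, t, threshold=1.0):
    """Plain TFI estimate at time step t (hypothetical helper).

    Finds the nearest spike at or before t and the nearest spike after t,
    and returns threshold / interval. Returns 0.0 when either neighbor is
    missing, which is what happens when spikes are too sparse.
    """
    prev_idx = next((i for i in range(t, -1, -1) if spikes[i]), None)
    next_idx = next((i for i in range(t + 1, len(spikes)) if spikes[i]), None)
    if prev_idx is None or next_idx is None:
        return 0.0
    return threshold / (next_idx - prev_idx)


# A bright pixel fires often, so a short window already brackets time t.
bright = simulate_spikes(0.25, 40)
# A dark pixel fires rarely, so the same window holds too few spikes.
dark_short = simulate_spikes(0.03125, 40)
# A longer window recovers the dark pixel, at the cost of temporal resolution.
dark_long = simulate_spikes(0.03125, 128)

print(tfi(bright, 20), tfi(dark_short, 20), tfi(dark_long, 40))
```

In this toy setting the bright pixel is recovered from a 40-step window, while the dark pixel yields no estimate until the window is extended to 128 steps. That trade-off between window length and temporal resolution is the sparse-information issue that an aggregated representation such as ETFI is meant to address.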
