Fast Window-Based Event Denoising with Spatiotemporal Correlation Enhancement
Huachen Fang, Jinjian Wu, Qibin Hou, Weisheng Dong, Guangming Shi
TL;DR
This work tackles the noise-prone outputs of event-based cameras by introducing a window-based denoising framework that processes stacks of events, enabling real-time performance with high interpretability. It combines a probabilistic temporal analysis (Temporal Window) and a learned spatial prior (Soft Spatial Feature Embedding) within a multi-scale architecture (WedNet) that employs hierarchical spatial feature learning and a bone-event check to preserve object structure. The method formulates denoising as a MAP problem and solves the spatial component via learned convolutional sparse coding, achieving robust denoising across multiple datasets (DVSCLEAN, DVSNOISE20, ED-KoGTL) and significantly faster runtimes than existing DL-based approaches. The practical impact lies in reliable, fast event denoising that improves downstream vision tasks in dynamic, noisy environments, enabling real-time neuromorphic perception systems.
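For readers unfamiliar with the terminology, the generic textbook form of a MAP objective solved by convolutional sparse coding can be sketched as follows. This is not taken from the paper: the symbols ($y$ for the observed noisy spatial map of a window, $x$ for its clean counterpart, $d_k$ for filters, $z_k$ for sparse codes, $\lambda$ for the sparsity weight) and the Gaussian-likelihood and Laplacian-prior assumptions are illustrative choices, and the paper's actual formulation may differ.

```latex
% MAP estimate of the clean map x given the observation y (generic form)
\hat{x} = \arg\max_{x} \, p(x \mid y)
        = \arg\max_{x} \, \big[ \log p(y \mid x) + \log p(x) \big]

% Assuming a Gaussian likelihood and a sparse (Laplacian) prior on the codes,
% with x = \sum_k d_k * z_k, this becomes convolutional sparse coding:
\min_{\{z_k\}} \; \frac{1}{2} \Big\| y - \sum_{k=1}^{K} d_k * z_k \Big\|_2^2
              + \lambda \sum_{k=1}^{K} \| z_k \|_1

% A learned solver unrolls ISTA-style iterations, with convolutional
% operators W_e, W_d and soft-threshold parameters \theta trained from data:
z^{(t+1)} = \mathcal{S}_{\theta}\Big( z^{(t)} + W_e * \big( y - W_d * z^{(t)} \big) \Big)
```

Here $\mathcal{S}_{\theta}$ denotes element-wise soft thresholding. Unrolling a fixed, small number of such iterations is what usually makes learned sparse coding both fast and easier to interpret than a generic deep network.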
Abstract
Previous deep learning-based event denoising methods mostly suffer from poor interpretability and difficulty with real-time processing due to their complex architecture designs. In this paper, we propose window-based event denoising, which processes a stack of events simultaneously, whereas existing element-based denoising handles one event at a time. In addition, we provide a theoretical analysis based on probability distributions in both the temporal and spatial domains to improve interpretability. In the temporal domain, we use the timestamp deviations between the events being processed and the central event to judge temporal correlation and filter out temporally irrelevant events. In the spatial domain, we use maximum a posteriori (MAP) estimation to discriminate real-world events from noise, and use learned convolutional sparse coding to optimize the objective function. Based on this theoretical analysis, we build a Temporal Window (TW) module and a Soft Spatial Feature Embedding (SSFE) module to process temporal and spatial information separately, and construct a novel multi-scale window-based event denoising network, named MSDNet. The high denoising accuracy and fast running speed of MSDNet enable real-time denoising in complex scenes. Extensive experimental results verify the effectiveness and robustness of MSDNet. Our algorithm removes event noise effectively and efficiently and improves the performance of downstream tasks.
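The temporal filtering idea described above can be illustrated with a minimal sketch. It is not the paper's Temporal Window module: the event layout `(x, y, timestamp_us, polarity)`, the function name `temporal_window_filter`, and the fixed threshold `tau_us` are all assumptions made for illustration, whereas the paper derives its criterion from a probabilistic analysis.

```python
from typing import List, Tuple

# Hypothetical event layout: (x, y, timestamp in microseconds, polarity).
Event = Tuple[int, int, int, int]


def temporal_window_filter(window: List[Event], tau_us: int) -> List[Event]:
    """Keep events whose timestamp is within tau_us of the central event.

    `window` is a stack of events ordered by arrival; the central event is
    taken to be the middle element of the stack (an assumption).
    """
    if not window:
        return []
    center_t = window[len(window) // 2][2]
    # Timestamp deviation from the central event decides temporal relevance.
    return [ev for ev in window if abs(ev[2] - center_t) <= tau_us]


window = [
    (10, 12, 1000, 1),
    (11, 12, 1040, 1),
    (11, 13, 1050, 0),  # central event of this 5-event stack
    (12, 13, 1065, 1),
    (40, 5, 9000, 0),   # far from the center in time
]
kept = temporal_window_filter(window, tau_us=100)
```

In this toy stack the last event lies 7950 µs from the central timestamp and is dropped, while the other four are kept. A hard threshold is the simplest possible stand-in; a soft, probability-based weighting would be closer in spirit to the analysis the abstract describes.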
