CompEvent: Complex-valued Event-RGB Fusion for Low-light Video Enhancement and Deblurring
Mingchen Zhong, Xin Lu, Dong Li, Senyan Xu, Ruixuan Jiang, Xueyang Fu, Baocai Yin
TL;DR
This work tackles the challenge of restoring videos degraded by simultaneous low-light conditions and motion blur. It introduces CompEvent, a complex-valued neural network framework that fuses event data and RGB frames throughout processing via two main components: the Complex Temporal Alignment Gated Recurrent Unit (CTA-GRU) and the Complex Space-Frequency Learning (CSFL) backbone. The method enables full-process spatiotemporal fusion in the complex domain, outperforming state-of-the-art methods on real-world (RELED) and synthetic (LOL-Blur) benchmarks. Results demonstrate the effectiveness of holistic complex fusion for robust low-light video enhancement and deblurring, with notable gains in PSNR and SSIM and solid ablation evidence for each component.
Abstract
Low-light video deblurring poses significant challenges in applications like nighttime surveillance and autonomous driving due to dim lighting and long exposures. While event cameras offer potential solutions with superior low-light sensitivity and high temporal resolution, existing fusion methods typically employ staged strategies, limiting their effectiveness against combined low-light and motion blur degradations. To overcome this, we propose CompEvent, a complex-valued neural network framework enabling holistic full-process fusion of event data and RGB frames for enhanced joint restoration. CompEvent features two core components: 1) the Complex Temporal Alignment GRU, which utilizes complex-valued convolutions and processes video and event streams iteratively via a GRU to achieve temporal alignment and continuous fusion; and 2) the Complex Space-Frequency Learning module, which performs unified complex-valued signal processing in both spatial and frequency domains, facilitating deep fusion through spatial structures and system-level characteristics. By leveraging the holistic representation capability of complex-valued neural networks, CompEvent achieves full-process spatiotemporal fusion, maximizes complementary learning between modalities, and significantly strengthens low-light video deblurring capability. Extensive experiments demonstrate that CompEvent outperforms state-of-the-art (SOTA) methods in addressing this challenging task.
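To make the idea of complex-valued processing in both spatial and frequency domains more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes, purely for illustration, that a frame feature map is placed in the real part and an event feature map in the imaginary part of one complex tensor; the abstract does not specify how the two modalities are encoded. The names `conv2d_valid`, `complex_conv2d`, `frequency_filter`, `rgb_feat`, and `event_feat` are hypothetical.

```python
import numpy as np

def conv2d_valid(x, k):
    """Real-valued 2D cross-correlation with 'valid' padding."""
    kh, kw = k.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * x[i:i + out.shape[0], j:j + out.shape[1]]
    return out

def complex_conv2d(z, w):
    """Complex convolution built from four real convolutions.

    Uses (a + ib)(c + id) = (ac - bd) + i(ad + bc), so the real and
    imaginary channels are mixed by every kernel tap.
    """
    a, b = z.real, z.imag
    c, d = w.real, w.imag
    real = conv2d_valid(a, c) - conv2d_valid(b, d)
    imag = conv2d_valid(a, d) + conv2d_valid(b, c)
    return real + 1j * imag

def frequency_filter(z, h):
    """Pointwise filtering of a complex map in the 2D Fourier domain."""
    return np.fft.ifft2(np.fft.fft2(z) * h)

rng = np.random.default_rng(0)
rgb_feat = rng.standard_normal((8, 8))    # hypothetical frame feature map
event_feat = rng.standard_normal((8, 8))  # hypothetical event feature map

# Assumed packing (not stated in the abstract): frames -> real, events -> imaginary.
z = rgb_feat + 1j * event_feat

# Random complex kernel standing in for learned weights.
w = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

fused = complex_conv2d(z, w)                      # spatial-domain mixing
identity = frequency_filter(z, np.ones((8, 8)))   # all-pass filter returns z
```

Because each output of `complex_conv2d` combines the real and imaginary inputs through both kernel parts, the two channels are coupled at every layer rather than merged once at a fixed stage, which is the general property that motivates complex-valued fusion. A learned network would replace the random kernel and the all-pass filter with trained parameters.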
