Towards Robust Event-guided Low-Light Image Enhancement: A Large-Scale Real-World Event-Image Dataset and Novel Approach

Guoqiang Liang, Kanghao Chen, Hangyu Li, Yunfan Lu, Lin Wang

TL;DR

This work tackles robust low-light image enhancement by leveraging event cameras and introduces the SDE dataset, a large-scale real-world collection with precise spatial-temporal alignment via a robotic arm. Building on this dataset, it presents EvLight, an event-guided LIE framework that fuses image and event features through a two-stage process: SNR-guided regional feature selection and a holistic-regional fusion branch, enabling robust restoration under challenging illumination and noise. Empirical results on real-world and synthetic data show EvLight outperforms state-of-the-art frame-based and prior event-guided methods in PSNR, PSNR*, and SSIM, with ablations validating the contribution of IRFS/ERFS and SNR guidance. The work provides a valuable dataset and a practical, robust method for real-world LIE applications, with code and data publicly available.
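For readers unfamiliar with the headline metric, PSNR follows the standard definition, 10 · log10(MAX² / MSE). The sketch below is a minimal NumPy version, assuming images are float arrays scaled to [0, 1]. The function name `psnr` and the `max_value` parameter are illustrative and are not taken from the EvLight code base. The PSNR* variant mentioned above is not defined in this summary, so it is not reproduced here.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    Assumes pixel values lie in [0, max_value]; returns inf for identical inputs.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    # Mean squared error over all pixels and channels.
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Higher values indicate a restored image closer to the normal-light reference.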

Abstract

Event cameras have recently received much attention for low-light image enhancement (LIE) thanks to their distinct advantages, such as high dynamic range. However, current research is prohibitively restricted by the lack of large-scale, real-world, and spatial-temporally aligned event-image datasets. To this end, we propose a real-world (indoor and outdoor) dataset comprising over 30K pairs of images and events under both low and normal illumination conditions. To achieve this, we utilize a robotic arm that traces a consistent non-linear trajectory to curate the dataset with spatial alignment precision under 0.03 mm. We then introduce a matching alignment strategy, rendering 90% of our dataset with errors less than 0.01 s. Based on the dataset, we propose a novel event-guided LIE approach, called EvLight, towards robust performance in real-world low-light scenes. Specifically, we first design the multi-scale holistic fusion branch to extract holistic structural and textural information from both events and images. To ensure robustness against variations in the regional illumination and noise, we then introduce a Signal-to-Noise-Ratio (SNR)-guided regional feature selection to selectively fuse features of images from regions with high SNR and enhance those with low SNR by extracting regional structure information from events. Extensive experiments on our dataset and the synthetic SDSD dataset demonstrate that EvLight significantly surpasses the frame-based methods. Code and datasets are available at https://vlislab22.github.io/eg-lowlight/.
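The SNR-guided selection idea can be illustrated with a small sketch. The abstract does not say how the SNR map is estimated or how features are fused, so everything below is an assumption. The SNR map uses a common heuristic (a mean-filtered image as the signal, the absolute residual as the noise), and the selection is a hard per-pixel mask that stands in for EvLight's learned IRFS/ERFS modules. The names `box_blur`, `snr_map`, `snr_guided_select`, and the `threshold` parameter are hypothetical.

```python
import numpy as np

def box_blur(gray, k=5):
    """Mean filter with edge padding (a simple stand-in denoiser, assumed here)."""
    pad = k // 2
    padded = np.pad(gray, pad, mode="edge")
    out = np.zeros_like(gray, dtype=np.float64)
    h, w = gray.shape
    # Sum the k*k shifted copies of the image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def snr_map(gray, k=5, eps=1e-6):
    """Heuristic per-pixel SNR: smoothed signal divided by absolute residual."""
    gray = np.asarray(gray, dtype=np.float64)
    smooth = box_blur(gray, k)
    noise = np.abs(gray - smooth)
    return smooth / (noise + eps)

def snr_guided_select(img_feat, evt_feat, snr, threshold):
    """Keep image features where SNR is high, event features elsewhere.

    img_feat, evt_feat: arrays of shape (H, W, C); snr: array of shape (H, W).
    The hard threshold is an illustrative simplification, not the paper's method.
    """
    mask = (snr > threshold).astype(np.float64)[..., None]
    return mask * img_feat + (1.0 - mask) * evt_feat
```

In this simplified form, well-lit regions with reliable image content pass through unchanged, while dark, noisy regions fall back on event-derived structure, which matches the intuition described in the abstract.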

Paper Structure (12 sections, 6 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: A challenging example of our dataset containing an extremely low-light image (a) and sparse events (b). Compared with the result from a SOTA frame-based method, Retinexformer [cai2023retinexformer] (c), our EvLight (d) not only recovers the structure details (e.g., the pipe on the ceiling) but also avoids over-enhancement and saturation in the bright regions (e.g., the lights).
  • Figure 2: (a) An illustration of collecting a spatially aligned image-event dataset by mounting a DAVIS 346 event camera on the robotic arm and recording the sequences along the same trajectory, respectively. (b) An overview of our matching alignment strategy. (c) An example of our dataset with images and paired events captured in low-light (with an ND8 filter) and normal-light conditions.
  • Figure 3: An overview of our framework. Our method consists of three parts: (a) Preprocessing (Sec. \ref{sec:preprocessing}), (b) SNR-guided Regional Feature Selection (Sec. \ref{sec:regional}), and (c) Holistic-Regional Fusion Branch (Sec. \ref{sec:holistic}). Specifically, SNR-guided Regional Feature Selection consists of two parts: Image-Regional Feature Selection (IRFS) and Event-Regional Feature Selection (ERFS). Additionally, the Holistic-Regional Fusion Branch encompasses Holistic Feature Extraction (HFE) and Holistic-Regional Feature Fusion (HRF).
  • Figure 4: Details of each block in SNR-guided Regional Feature Selection and Holistic-Regional Fusion Branch's decoder.
  • Figure 5: Qualitative results on our SDE-in dataset.
  • ...and 3 more figures