Towards Realistic Low-Light Image Enhancement via ISP Driven Data Modeling

Zhihua Wang, Yu Long, Qinghua Lin, Kai Zhang, Yazhu Zhang, Yuming Fang, Li Liu, Xiaochun Cao

TL;DR

This work tackles the data scarcity and real-world generalization challenges of low-light image enhancement by introducing an ISP-driven data synthesis pipeline that unprocesses normal-light images to RAW, applies RAW-domain degradations, and re-processes through ISP with varied white balance, color transforms, tone mapping, and gamma correction. The approach generates unlimited paired training data, enabling effective training of a simple vanilla U-Net and improving SOTA LLIE models when retrained with the synthetic data. Extensive experiments across paired and unpaired LLIE benchmarks, as well as high-level perception tasks, show consistent improvements in perceptual quality and task performance, highlighting the practical impact for real-world deployment. The results underscore the importance of RAW-domain synthesis and ISP-aware variability for robust LLIE, offering a scalable path toward more generalizable low-light vision systems.

Abstract

Deep neural networks (DNNs) have recently become the leading method for low-light image enhancement (LLIE). However, despite significant progress, their outputs may still exhibit issues such as amplified noise, incorrect white balance, or unnatural enhancements when deployed in real-world applications. A key challenge is the lack of diverse, large-scale training data that captures the complexities of low-light conditions and imaging pipelines. In this paper, we propose a novel image signal processing (ISP) driven data synthesis pipeline that addresses these challenges by generating unlimited paired training data. Specifically, our pipeline begins with easily collected high-quality normal-light images, which are first unprocessed into the RAW format using a reverse ISP. We then synthesize low-light degradations directly in the RAW domain. The resulting data is subsequently processed through a series of ISP stages, including white balance adjustment, color space conversion, tone mapping, and gamma correction, with controlled variations introduced at each stage. This broadens the degradation space and enhances the diversity of the training data, enabling the generated data to capture a wide range of degradations and the complexities inherent in the ISP pipeline. To demonstrate the effectiveness of our synthetic pipeline, we conduct extensive experiments using a vanilla U-Net model consisting solely of convolutional layers, group normalization, GeLU activation, and convolutional block attention modules (CBAMs). Extensive testing across multiple datasets reveals that the vanilla U-Net model trained with our data synthesis pipeline delivers high-fidelity, visually appealing enhancement results, surpassing state-of-the-art (SOTA) methods both quantitatively and qualitatively.
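The three stages described above (unprocess to RAW, degrade in the RAW domain, re-process with randomized ISP settings) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the parameter ranges, the power-law gamma, and the heteroscedastic Gaussian noise approximation (variance = shot * signal + read^2) are all assumptions chosen for illustration, and color space conversion and tone mapping are omitted for brevity.

```python
import random

def srgb_to_linear(v, gamma=2.2):
    # Assumed stand-in for the inverse display curve: a plain power law.
    return max(v, 0.0) ** gamma

def linear_to_srgb(v, gamma=2.2):
    # Clip to [0, 1] before applying the forward power-law gamma.
    return min(max(v, 0.0), 1.0) ** (1.0 / gamma)

def synthesize_low_light(pixels, rng, exposure_range=(0.05, 0.3)):
    """Turn normal-light pixels into a synthetic low-light counterpart.

    pixels: list of (r, g, b) floats in [0, 1].
    All sampling ranges below are hypothetical, not taken from the paper.
    """
    # Gains assumed to have been applied by the original camera ISP.
    wb_gains = (rng.uniform(1.6, 2.4), 1.0, rng.uniform(1.4, 2.0))
    # RAW-domain degradation parameters.
    exposure = rng.uniform(*exposure_range)
    shot = rng.uniform(1e-3, 1e-2)
    read = rng.uniform(1e-3, 1e-2)
    # Forward-ISP variation: jittered white balance and output gamma.
    wb_out = tuple(g * rng.uniform(0.9, 1.1) for g in wb_gains)
    gamma_out = rng.uniform(2.0, 2.4)

    degraded = []
    for rgb in pixels:
        channels = []
        for value, gain_in, gain_out in zip(rgb, wb_gains, wb_out):
            # 1) Unprocess: invert gamma, then invert white balance.
            raw = srgb_to_linear(value) / gain_in
            # 2) Degrade in RAW: reduce exposure, add signal-dependent noise.
            raw *= exposure
            sigma = (shot * raw + read ** 2) ** 0.5
            raw += rng.gauss(0.0, sigma)
            # 3) Re-process: perturbed white balance and gamma.
            channels.append(linear_to_srgb(raw * gain_out, gamma_out))
        degraded.append(tuple(channels))
    return degraded

if __name__ == "__main__":
    rng = random.Random(0)
    clean = [(0.8, 0.6, 0.4)] * 4
    for pair in zip(clean, synthesize_low_light(clean, rng)):
        print(pair)
```

Because every call draws fresh parameters, a single normal-light image can yield many distinct low-light versions, which is the sense in which such a pipeline can generate paired data without a fixed limit.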

Paper Structure

This paper contains 30 sections, 9 equations, 10 figures, 11 tables.

Figures (10)

  • Figure 1: Visual comparisons of recent SOTA LLIE methods, e.g., SNR-Net xu2022snrnet and Retinexformer cai2023retinexformer, trained on small-scale LOL-v2 yang2019coarse versus our large-scale synthetic data (indicated by bold). Methods trained only on LOL-v2 exhibit issues such as incorrect white balance (top) and abnormal enhancement (bottom). In contrast, the same models trained on our synthetic data produce more visually pleasing results.
  • Figure 2: Major components of in-camera image processing pipeline.
  • Figure 3: Comparison of the low-light coverage across different datasets using the exposure adjustment curves, which map the luminance histogram of the low-light images to that of their corresponding normal-light ground truth. A steeper curve suggests a higher degree of underexposure.
  • Figure 4: The overview of the vanilla U-Net, where Conv-GN-GeLU-CBAM represents a sequential combination of convolution, group normalization, GeLU activation function, and the CBAM block. $\oplus$ and $\otimes$ denote tensor addition and multiplication, respectively.
  • Figure 5: Qualitative comparison of various LLIE methods on a representative sample from LOL-v1 chen2018retinex. FT indicates that the vanilla U-Net was first trained on synthetic data and then fine-tuned on LOL-v1. The vanilla U-Net significantly enhances visibility, reduces noise, and preserves color fidelity. Please zoom in for details.
  • ...and 5 more figures
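
The exposure adjustment curve mentioned in the Figure 3 caption maps the luminance histogram of a low-light image to that of its normal-light ground truth. One plausible way to compute such a curve is classical histogram matching through the two cumulative distributions. The sketch below assumes 8-bit integer luminance values and a hypothetical function name; the paper's exact procedure is not given in this summary.

```python
from bisect import bisect_left

def exposure_adjustment_curve(low, normal, levels=256):
    """Return a lookup table mapping low-light levels to normal-light levels.

    low, normal: iterables of integer luminance values in [0, levels).
    Assumption: the curve is obtained by matching cumulative histograms.
    """
    def cdf(values):
        values = list(values)
        hist = [0] * levels
        for v in values:
            hist[v] += 1
        running, out = 0, []
        for count in hist:
            running += count
            out.append(running / len(values))
        return out

    cdf_low, cdf_normal = cdf(low), cdf(normal)
    curve = []
    for level in range(levels):
        # First normal-light level whose CDF reaches the low-light CDF here.
        target = bisect_left(cdf_normal, cdf_low[level])
        curve.append(min(target, levels - 1))
    return curve

if __name__ == "__main__":
    curve = exposure_adjustment_curve([10, 20, 30, 40], [100, 150, 200, 250])
    print(curve[10], curve[20], curve[30], curve[40])
```

Under this reading, a curve that rises quickly at small input levels means dark pixels must be pushed far up to match the ground truth, which matches the caption's remark that a steeper curve suggests heavier underexposure.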