Interpretable Unsupervised Joint Denoising and Enhancement for Real-World low-light Scenarios

Huaqiu Li, Xiaowan Hu, Haoqian Wang

TL;DR

This work tackles real-world low-light image restoration by building a zero-reference, interpretable framework for joint denoising and enhancement. It targets complex degradations by modeling image formation with $I = (R + N) \circ L$, leveraging Retinex theory, neighboring-pixel masking for self-supervision, and a frequency-illumination prior encoder (FIcoder) that injects RGB-space DCT priors into a hybrid transformer (DEnet) comprising REFnet, LUMnet, and LCnet. The method learns implicit degradation representations and performs frequency-domain decomposition to disentangle illumination, reflectance, and noise, achieving state-of-the-art results on real-world datasets (LOLv1/LOLv2, SICE, SIDD) while maintaining interpretability through physically grounded priors and decompositions. The approach demonstrates strong denoising, illumination correction, and color fidelity, with ablations confirming the contribution of priors, masking, and adaptive illumination, and the authors provide code for reproducibility.
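The formation model $I = (R + N) \circ L$ can be illustrated with a small NumPy sketch. This is only a toy illustration, not the authors' code: the function names, the Gaussian noise assumption, and the constant illumination map are all hypothetical choices made for the example.

```python
import numpy as np

def synthesize_low_light(reflectance, illumination, noise_sigma=0.02, seed=0):
    """Compose I = (R + N) * L element-wise (Hadamard product).

    The Gaussian noise model is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_sigma, size=reflectance.shape)  # N
    return (reflectance + noise) * illumination, noise

def recover_noisy_reflectance(image, illumination, eps=1e-6):
    """Divide out a known L: I / L = R + N, so the noise stays on R."""
    return image / np.maximum(illumination, eps)

# Toy scene: random reflectance R under dim, uniform illumination L.
rng = np.random.default_rng(1)
R = rng.uniform(0.2, 0.9, size=(8, 8, 3))
L = np.full((8, 8, 1), 0.1)          # broadcast over the colour channels
I, N = synthesize_low_light(R, L)
R_noisy = recover_noisy_reflectance(I, L)
```

Dividing by the illumination recovers `R + N` rather than `R`, which is one way to see why enhancement alone is not enough and denoising has to be handled jointly.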

Abstract

Real-world low-light images often suffer from complex degradations such as local overexposure, low brightness, noise, and uneven illumination. Supervised methods tend to overfit to specific scenarios, while unsupervised methods, though better at generalization, struggle to model these degradations due to the lack of reference images. To address this issue, we propose an interpretable, zero-reference joint denoising and low-light enhancement framework tailored for real-world scenarios. Our method derives a training strategy based on paired sub-images with varying illumination and noise levels, grounded in physical imaging principles and Retinex theory. Additionally, we leverage the Discrete Cosine Transform (DCT) to perform frequency-domain decomposition in the sRGB space, and introduce an implicit-guided hybrid representation strategy that effectively separates intricate compounded degradations. In the backbone network design, we develop a retinal decomposition network guided by implicit degradation representation mechanisms. Extensive experiments demonstrate the superiority of our method. Code will be available at https://github.com/huaqlili/unsupervised-light-enhance-ICLR2025.
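As a rough illustration of frequency-domain decomposition with the DCT in the sRGB space, the sketch below splits each colour channel into a low-frequency and a high-frequency part. The band partition (`u + v < cutoff`) and the helper names are assumptions made for this example; the abstract does not specify how the authors partition the spectrum.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row has its own normalisation
    return m

def dct_split(channel, cutoff):
    """Split one 2-D channel into low- and high-frequency images."""
    h, w = channel.shape
    dh, dw = dct_matrix(h), dct_matrix(w)
    coeffs = dh @ channel @ dw.T                 # forward 2-D DCT
    u = np.arange(h)[:, None]
    v = np.arange(w)[None, :]
    low_mask = (u + v) < cutoff                  # hypothetical band rule
    low = dh.T @ (coeffs * low_mask) @ dw        # inverse DCT of low band
    high = dh.T @ (coeffs * ~low_mask) @ dw      # inverse DCT of the rest
    return low, high

def dct_split_rgb(image, cutoff):
    """Apply dct_split to every channel of an (H, W, C) image."""
    parts = [dct_split(image[..., c], cutoff) for c in range(image.shape[-1])]
    lows, highs = zip(*parts)
    return np.stack(lows, axis=-1), np.stack(highs, axis=-1)

# Toy usage on a random "image" with values in [0, 1).
img = np.random.default_rng(0).uniform(size=(16, 16, 3))
low, high = dct_split_rgb(img, cutoff=4)
```

Because the basis is orthonormal, the two parts sum back to the input exactly (up to floating-point error). Slowly varying illumination would be expected to concentrate in `low`, while noise and fine texture would mostly land in `high`.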

Paper Structure

This paper contains 16 sections, 16 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Compared with state-of-the-art methods clipSCI on the SIDD dataset, our approach achieves the best results in denoising, enhancement, and color fidelity grounded in real-world imaging principles.
  • Figure 2: The pipeline of our proposed method. First, we preprocess the low-light full-resolution image $I$ using pixel masks and gamma-based nonlinear enhancement, generating sub-images with varying illumination and noise levels. These are then processed by Decompose-Net, a transformer architecture that integrates hybrid degradation representations and uses cross-attention to inject guiding embeddings. Finally, LCnet enhances the illumination map.
  • Figure 3: Illustration of the hybrid prior degradation representation guided by multi-head cross attention. After processing, the feature maps exhibit clearer hierarchical structure and reduced noise.
  • Figure 4: The visualization of the five image priors. They represent chromaticity, semantic information, edge contours, and noise intensity.
  • Figure 5: Visual comparison of typical unsupervised enhancement methods in LOL lolv2. Flesh pink boxes indicate the obvious differences.
  • ...and 4 more figures
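
The preprocessing step in the Figure 2 caption (pixel masks plus gamma-based enhancement producing paired sub-images) might look roughly like the sketch below. It is a simplified assumption, not the authors' procedure: each 2×2 cell contributes two distinct randomly chosen pixels, and the function names and the gamma value are hypothetical.

```python
import numpy as np

def neighbor_subimages(image, seed=0):
    """Pick two different pixels from every 2x2 cell.

    Returns two half-resolution sub-images whose content is nearly the
    same but whose noise samples come from different pixels.
    """
    h, w, c = image.shape
    h2, w2 = h // 2, w // 2
    cells = image[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2, c)
    cells = cells.transpose(0, 2, 1, 3, 4).reshape(h2, w2, 4, c)
    rng = np.random.default_rng(seed)
    first = rng.integers(0, 4, size=(h2, w2))
    # An offset of 1..3 guarantees the second index differs from the first.
    second = (first + rng.integers(1, 4, size=(h2, w2))) % 4
    rows, cols = np.indices((h2, w2))
    return cells[rows, cols, first], cells[rows, cols, second]

def gamma_enhance(image, gamma=0.5):
    """Brighten an image in [0, 1] with a gamma curve (gamma < 1)."""
    return np.clip(image, 0.0, 1.0) ** gamma

# Toy usage: a dark random image, two sub-images, one of them brightened.
dark = np.random.default_rng(0).uniform(0.0, 0.2, size=(6, 6, 3))
sub_a, sub_b = neighbor_subimages(dark)
sub_b_bright = gamma_enhance(sub_b)
```

Here `sub_a` and `sub_b_bright` form a pair with different noise samples and different illumination levels, which is the kind of pairing the caption describes.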