Zero-Reference Low-Light Enhancement via Physical Quadruple Priors

Wenjing Wang, Huan Yang, Jianlong Fu, Jiaying Liu

TL;DR

This work tackles zero-reference low-light enhancement by learning illumination-invariant features from normal-light data, using a novel physical quadruple prior derived from Kubelka–Munk light-transfer theory. The prior, consisting of $H$, $C$, $W$, and $O$, serves as an intermediate representation between illumination conditions. A prior-to-image framework uses a frozen diffusion model (Stable Diffusion) conditioned on this prior to reconstruct normal-light images from low-light inputs, while a bypass decoder and a lightweight distilled version address detail preservation and efficiency. Across diverse benchmarks (LOL, MIT FiveK, and unpaired sets), the method delivers robust, interpretable improvements over many unsupervised baselines and approaches supervised performance without requiring low-light training data. The result is a practical zero-reference enhancement pipeline with strong generalization and a fast inference option.

Abstract

Understanding illumination and reducing the need for supervision pose a significant challenge in low-light enhancement. Current approaches are highly sensitive to data usage during training and illumination-specific hyper-parameters, limiting their ability to handle unseen scenarios. In this paper, we propose a new zero-reference low-light enhancement framework trainable solely with normal light images. To accomplish this, we devise an illumination-invariant prior inspired by the theory of physical light transfer. This prior serves as the bridge between normal and low-light images. Then, we develop a prior-to-image framework trained without low-light data. During testing, this framework is able to restore our illumination-invariant prior back to images, automatically achieving low-light enhancement. Within this framework, we leverage a pretrained generative diffusion model for model ability, introduce a bypass decoder to handle detail distortion, as well as offer a lightweight version for practicality. Extensive experiments demonstrate our framework's superiority in various scenarios as well as good interpretability, robustness, and efficiency. Code is available on our project homepage: http://daooshee.github.io/QuadPrior-Website/
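
To make the idea of an illumination-invariant representation concrete, here is a minimal sketch under a strong simplifying assumption: low light is modeled as one positive gain applied to every channel of a pixel. The two features below (channel ordering and sum-normalized chromaticity) are illustrative stand-ins chosen for simplicity, not the paper's $H$, $C$, $W$, $O$ definitions, and the helper names are hypothetical.

```python
# Illustrative only: stand-in features, not the paper's quadruple prior.

def channel_order(pixel):
    """Return the RGB channel indices sorted from darkest to brightest."""
    return tuple(sorted(range(3), key=lambda i: pixel[i]))

def chromaticity(pixel):
    """Return each channel divided by the channel sum (zeros for a black pixel)."""
    total = sum(pixel)
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(c / total for c in pixel)

def dim(pixel, gain):
    """Assumed low-light model: scale every channel by the same positive gain."""
    return tuple(c * gain for c in pixel)

normal = (0.8, 0.4, 0.2)   # a normal-light RGB pixel in [0, 1]
low = dim(normal, 0.05)    # the same pixel under much weaker light

# Both features should agree (up to rounding) between the two pixels.
same_order = channel_order(normal) == channel_order(low)
max_chroma_gap = max(
    abs(a - b) for a, b in zip(chromaticity(normal), chromaticity(low))
)
print(same_order, max_chroma_gap < 1e-9)
```

Because such features carry no information about the gain, a model trained to reconstruct normal-light images from them has no way to reproduce the darkness of a low-light input. That is the intuition behind training on normal-light data only. Real low-light images also contain noise and non-uniform illumination, which this toy model ignores.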

Paper Structure (10 sections, 6 equations, 11 figures, 2 tables)

Figures (11)

  • Figure 1: Comparison with a SOTA zero-reference method, SCI. SCI models trained on different datasets, such as LOL and MIT FiveK, yield differing enhancement results, yet none maintains a consistent lighting effect across both dark and moderately dark images. In contrast, our model demonstrates greater robustness across various scenarios.
  • Figure 2: The overall methodology of our zero-reference low-light enhancement approach. Our model is trained to reconstruct images from an illumination-invariant prior (the physical quadruple prior) in the normal light domain. During testing, the model extracts illumination-invariant priors from low-light images and reconstructs them into normal light images.
  • Figure 3: Our illumination-invariant prior and the training process for our prior-to-image model framework. We start by predicting the physical quadruple prior from the input image $I$. During the training phase, the model dynamically learns the linear mapping $\mathcal{W}$ and the layers for predicting the scale $\sigma$. In the process of reconstructing priors into images, a static SD encoder extracts the latent representation $z_0$ from the input image $I$. Following this, we sample noisy latent $z_t$ based on $z_0$. Finally, the physical quadruple prior is encoded by convolutional and transformer modules, and is then merged with a frozen SD U-net to predict both noise $\epsilon$ and $z_0$.
  • Figure 4: Image restoration effect of the SD decoder and ours. (a) Input image $I$, from which we extract latent $z_0$. (b) $z_0$ decoded by the SD decoder. (c) $\tilde{I}$, the distorted version of $I$. (d) $z_0$ decoded by our decoder using the encoder features from $\tilde{I}$.
  • Figure 5: The training strategy of our bypass decoder. We distort the input image $I$ into $\tilde{I}$, and allow the decoder to reconstruct $I$ using encoder features from the distorted $\tilde{I}$.
  • ...and 6 more figures
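
The latent noising step mentioned in the Figure 3 caption (sampling a noisy latent $z_t$ from $z_0$, then predicting the noise $\epsilon$ and $z_0$) can be sketched with the standard DDPM forward process, $z_t = \sqrt{\bar{\alpha}_t}\, z_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon$. The snippet below is a toy illustration in plain Python, not the paper's implementation. The linear beta schedule, its endpoints, the 8-element "latent", and the function names are all assumptions made for the example. It also shows that an estimate of $z_0$ follows algebraically once the noise is known.

```python
import math
import random

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def add_noise(z0, eps, a_bar):
    """Forward process: z_t = sqrt(a_bar) * z0 + sqrt(1 - a_bar) * eps."""
    s, n = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [s * z + n * e for z, e in zip(z0, eps)]

def recover_z0(zt, eps, a_bar):
    """Solve the forward equation for z0, given a noise estimate eps."""
    s, n = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [(z - n * e) / s for z, e in zip(zt, eps)]

rng = random.Random(0)
z0 = [rng.uniform(-1, 1) for _ in range(8)]   # toy stand-in for an SD latent
eps = [rng.gauss(0, 1) for _ in range(8)]     # Gaussian noise sample
alpha_bar = linear_alpha_bar(1000)

t = 500
zt = add_noise(z0, eps, alpha_bar[t])

# With the true noise, z0 is recovered up to floating-point rounding.
z0_hat = recover_z0(zt, eps, alpha_bar[t])
max_err = max(abs(a - b) for a, b in zip(z0, z0_hat))
print(max_err < 1e-9)
```

In training, a network would supply the noise estimate in place of the true `eps`, so the quality of the recovered `z0_hat` depends on how well the noise is predicted.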