Low-Light Image Enhancement via Generative Perceptual Priors
Han Zhou, Wei Dong, Xiaohong Liu, Yulun Zhang, Guangtao Zhai, Jun Chen
TL;DR
This work addresses two open problems in low-light image enhancement (LLIE): coping with diverse real-world illumination and producing visually realistic results. It introduces GPP-LLIE, a framework that derives global and local perceptual priors from a vision-language model (VLM) and uses them to guide a transformer-based diffusion backbone. The priors are obtained by prompting a pre-trained VLM (LLaVA) to assess the contrast, visibility, and sharpness of the low-light image; a sigmoid-based strategy then quantifies these assessments into a global score and a local quality map that steer the diffusion process. Two components, GPP-LN and LPP-Attn, inject the priors into the backbone, enabling adaptive enhancement that preserves natural colors and textures under varied real-world lighting. Experiments show that the model outperforms current state-of-the-art methods on paired LLIE datasets and generalizes well to real-world data, and that the priors can also be used to improve other LLIE models.
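To make the idea of sigmoid-based quantification concrete, the following is a minimal sketch, not the authors' implementation. It assumes that, for each attribute, the VLM exposes one logit for a favorable answer and one for an unfavorable answer; the function names, the averaging across attributes, and the patch-wise layout of the local map are all hypothetical choices made for illustration.

```python
import math
from typing import Dict, List, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def attribute_score(logit_positive: float, logit_negative: float) -> float:
    """Map a pair of (assumed) VLM answer logits to a score in (0, 1).

    sigmoid(a - b) equals the two-way softmax probability of the
    positive answer, so the score can be read as a confidence.
    """
    return sigmoid(logit_positive - logit_negative)


def global_prior(attribute_logits: Dict[str, Tuple[float, float]]) -> float:
    """Average the per-attribute scores into one global score.

    Averaging is an assumption of this sketch, not taken from the paper.
    """
    scores = [attribute_score(p, n) for p, n in attribute_logits.values()]
    return sum(scores) / len(scores)


def local_prior_map(
    patch_logits: List[List[Tuple[float, float]]]
) -> List[List[float]]:
    """Score each image patch independently to form a local quality map."""
    return [[attribute_score(p, n) for p, n in row] for row in patch_logits]


if __name__ == "__main__":
    # Hypothetical logits for the three attributes named in the text.
    logits = {
        "contrast": (0.2, 1.5),
        "visibility": (-0.3, 2.0),
        "sharpness": (0.8, 0.6),
    }
    print("global score:", global_prior(logits))
    print("local map:", local_prior_map([[(0.0, 1.0), (1.0, 0.0)]]))
```

Taking the sigmoid of a logit difference keeps every score strictly between 0 and 1, which makes the global score and the local map directly comparable across images.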
Abstract
Although significant progress has been made in enhancing visibility, recovering texture details, and mitigating noise in Low-Light (LL) images, applying current Low-Light Image Enhancement (LLIE) methods to real-world scenarios remains challenging, primarily because of the diverse illumination conditions encountered. Furthermore, generating enhancements that are visually realistic and attractive remains underexplored. In response to these challenges, we introduce a novel LLIE framework guided by Generative Perceptual Priors (GPP-LLIE) derived from vision-language models (VLMs). Specifically, we first propose a pipeline that guides VLMs to assess multiple visual attributes of the LL image and quantifies the assessment to output global and local perceptual priors. Subsequently, to incorporate these generative perceptual priors into LLIE, we introduce a transformer-based backbone in the diffusion process and develop a new layer normalization (GPP-LN) and an attention mechanism (LPP-Attn) guided by the global and local perceptual priors, respectively. Extensive experiments demonstrate that our model outperforms current SOTA methods on paired LL datasets and exhibits superior generalization on real-world data. The code is released at https://github.com/LowLevelAI/GPP-LLIE.
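As a rough illustration of how a global prior could guide layer normalization, the sketch below shows one plausible reading: a scalar global score modulates the scale and shift applied after standard normalization. This is an assumption about the general idea, not the GPP-LN definition from the paper; the function `prior_modulated_layer_norm` and its weight parameters are hypothetical.

```python
import math
from typing import List


def layer_norm(x: List[float], eps: float = 1e-5) -> List[float]:
    """Standard layer normalization of one feature vector (no affine terms)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def prior_modulated_layer_norm(
    x: List[float],
    global_score: float,
    w_scale: List[float],
    b_scale: List[float],
    w_shift: List[float],
    b_shift: List[float],
    eps: float = 1e-5,
) -> List[float]:
    """Normalize x, then apply a scale and shift that depend on the prior.

    The per-channel scale and shift are linear functions of the scalar
    global score (hypothetical parameterization for this sketch).
    """
    normed = layer_norm(x, eps)
    out = []
    for i, v in enumerate(normed):
        scale = 1.0 + w_scale[i] * global_score + b_scale[i]
        shift = w_shift[i] * global_score + b_shift[i]
        out.append(scale * v + shift)
    return out


if __name__ == "__main__":
    features = [0.5, -1.0, 2.0, 0.0]
    zeros = [0.0] * len(features)
    # With all modulation weights at zero, the result is plain layer norm.
    print(prior_modulated_layer_norm(features, 0.3, zeros, zeros, zeros, zeros))
    # Non-zero weights let the global score change the output statistics.
    ones = [1.0] * len(features)
    print(prior_modulated_layer_norm(features, 0.3, ones, zeros, ones, zeros))
```

Writing the scale as `1.0 + ...` means the layer reduces to ordinary layer normalization when the modulation weights are zero, so the prior acts as a learned adjustment rather than a replacement.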
