LDM-ISP: Enhancing Neural ISP for Low Light with Latent Diffusion Models
Qiang Wen, Zhefan Rao, Yazhou Xing, Qifeng Chen
TL;DR
The paper addresses extreme low-light image enhancement (LLIE) by casting it as generative restoration guided by a pre-trained latent diffusion model. It introduces LDM-ISP, which adapts a frozen Stable Diffusion backbone with lightweight taming modules and uses a 2D discrete wavelet transform to split the task into low-frequency structure generation and high-frequency detail maintenance. By assigning the low-frequency (LL) subband to the UNet and the high-frequency subbands to the decoder, the approach exploits the distinct generative priors of the two components and achieves state-of-the-art perceptual performance on real LLIE datasets. The method offers practical benefits for neural ISP under challenging lighting: it reduces the need for extensive dataset collection and for fine-tuning the full diffusion model, while delivering high-fidelity, clean sRGB outputs.
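To make the frequency split concrete, the following is a minimal sketch of a single-level 2D discrete wavelet transform. It assumes the Haar wavelet, a single-channel image with even height and width, and NumPy; the function names `haar_dwt2` and `haar_idwt2` are hypothetical, and the wavelet and implementation actually used by LDM-ISP may differ. Naming conventions for the LH and HL subbands also vary between libraries.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT of an (H, W) array with even H and W.

    Returns the low-frequency subband LL and a tuple of the three
    high-frequency subbands (LH, HL, HH), each of shape (H/2, W/2).
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average: coarse structure
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, (lh, hl, hh)

def haar_idwt2(ll, highs):
    """Inverse of haar_dwt2: rebuilds the (H, W) array from its subbands."""
    lh, hl, hh = highs
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

# Illustrative usage on random data standing in for one image channel.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
ll, highs = haar_dwt2(image)
restored = haar_idwt2(ll, highs)
```

In this sketch the LL subband is a half-resolution summary of the image structure, which is the kind of content the summary above assigns to the UNet, while the three remaining subbands carry the edges and fine texture assigned to the decoder. The transform is exactly invertible, so no information is lost by the split itself.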
Abstract
Enhancing a low-light noisy RAW image into a well-exposed and clean sRGB image is a significant challenge for modern digital cameras. Prior approaches have difficulty recovering the fine-grained details and true colors of a scene in extremely low-light environments because the SNR is close to zero. Meanwhile, diffusion models have shown significant progress in general-domain image generation. In this paper, we propose to leverage a pre-trained latent diffusion model to perform neural ISP for enhancing extremely low-light images. Specifically, to tailor the pre-trained latent diffusion model to operate on the RAW domain, we train a set of lightweight taming modules that inject the RAW information into the diffusion denoising process by modulating the intermediate features of the UNet. We further observe that UNet denoising and decoder reconstruction play different roles in the latent diffusion model, which inspires us to decompose the low-light image enhancement task into latent-space low-frequency content generation and decoding-phase high-frequency detail maintenance. Through extensive experiments on representative datasets, we demonstrate that our simple design not only achieves state-of-the-art performance in quantitative evaluations but also shows significant superiority in visual comparisons over strong baselines, highlighting the effectiveness of powerful generative priors for neural ISP under extremely low-light environments. The project page is available at https://csqiangwen.github.io/projects/ldm-isp/
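One common way to modulate intermediate features with a conditioning signal is a per-pixel scale and shift. The sketch below illustrates that idea only: the abstract does not specify the form of the taming modules, so the scale-and-shift formulation, the 1x1 projection weights, and the name `modulate` are all assumptions, not the authors' design.

```python
import numpy as np

def modulate(feat, cond, w_gamma, w_beta):
    """Scale-and-shift modulation of a frozen feature map (assumed form).

    feat:    (C, H, W) intermediate UNet feature
    cond:    (K, H, W) feature derived from the RAW input
    w_gamma: (C, K) weights of a 1x1 projection predicting the scale
    w_beta:  (C, K) weights of a 1x1 projection predicting the shift
    """
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    gamma = np.einsum("ck,khw->chw", w_gamma, cond)
    beta = np.einsum("ck,khw->chw", w_beta, cond)
    return feat * (1.0 + gamma) + beta

# Illustrative usage with random data; all shapes are hypothetical.
rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 8, 8))
cond = rng.standard_normal((4, 8, 8))
zero_w = np.zeros((16, 4))
unchanged = modulate(feat, cond, zero_w, zero_w)
small_w = 0.1 * rng.standard_normal((16, 4))
modulated = modulate(feat, cond, small_w, small_w)
```

With zero-initialized projection weights the module in this sketch is the identity, so training would start from the unmodified behavior of the frozen backbone and only the small projection weights would need to be learned. Whether LDM-ISP uses such an initialization is not stated in the abstract.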
