Personalized Generative Low-light Image Denoising and Enhancement

Xijun Wang, Prateek Chennuri, Dilshan Godaliyadda, Yu Yuan, Bole Ma, Xingguang Zhang, Hamid R. Sheikh, Stanley Chan

TL;DR

The paper addresses the challenge of denoising and enhancing facial images captured in low light by proposing DiffPGD, a diffusion-based framework that personalizes restoration through ID-consistent buffers derived from a user’s clean photo gallery. By combining gallery-driven identity cues with physically grounded buffers (albedo and surface normals) and feeding them to a diffusion model as conditioning signals, DiffPGD preserves individual identity while reducing hallucinations under severe noise. The training objective is a conditional diffusion loss on the denoised output: a forward process progressively corrupts the image, a reverse process is trained to recover it, and the ID buffers modulate the network in a FiLM-style manner. Experiments on simulated and real low-light data show that DiffPGD achieves better identity preservation and image quality than state-of-the-art baselines, all without per-user fine-tuning, which highlights its practical potential for personalized image restoration in mobile photography.
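The forward corruption and FiLM-style conditioning mentioned above can be illustrated with a minimal NumPy sketch. It assumes a DDPM-style linear noise schedule and a noise-prediction loss, neither of which is specified in this summary. The names `make_alpha_bar`, `forward_diffuse`, `film`, and `denoising_loss`, along with all shapes and weights, are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal retention for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(a_t) x_0, (1 - a_t) I)."""
    eps = rng.standard_normal(x0.shape)
    a_t = alpha_bar[t]
    return np.sqrt(a_t) * x0 + np.sqrt(1.0 - a_t) * eps, eps

def film(features, cond, w_gamma, w_beta):
    """FiLM-style modulation: per-channel scale and shift from a condition.

    features: (C, H, W); cond: (K,); w_gamma, w_beta: (K, C).
    The (1 + gamma) form leaves features unchanged when the weights are zero.
    """
    gamma = cond @ w_gamma
    beta = cond @ w_beta
    return (1.0 + gamma)[:, None, None] * features + beta[:, None, None]

def denoising_loss(eps_pred, eps):
    """Mean squared error between predicted and true noise (assumed loss form)."""
    return float(np.mean((eps_pred - eps) ** 2))

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))   # stand-in clean image
x_t, eps = forward_diffuse(x0, t=500, alpha_bar=alpha_bar, rng=rng)

id_code = rng.standard_normal(16)             # stand-in ID-buffer embedding
w_gamma = 0.1 * rng.standard_normal((16, 3))  # hypothetical learned weights
w_beta = 0.1 * rng.standard_normal((16, 3))
modulated = film(x_t, id_code, w_gamma, w_beta)
```

In a real model the modulation would act on intermediate network features rather than on the noisy image itself; it is applied to `x_t` here only to keep the sketch short.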

Abstract

Modern cameras' performance in low-light conditions remains suboptimal due to fundamental limitations in photon shot noise and sensor read noise. Generative image restoration methods have shown promising results compared to traditional approaches, but they suffer from hallucinatory content generation when the signal-to-noise ratio (SNR) is low. Leveraging the availability of personalized photo galleries of the users, we introduce Diffusion-based Personalized Generative Denoising (DiffPGD), a new approach that builds a customized diffusion model for individual users. Our key innovation lies in the development of an identity-consistent physical buffer that extracts the physical attributes of the person from the gallery. This ID-consistent physical buffer serves as a robust prior that can be seamlessly integrated into the diffusion model to restore degraded images without the need for fine-tuning. Over a wide range of low-light testing scenarios, we show that DiffPGD achieves superior image denoising and enhancement performance compared to existing diffusion-based denoising approaches. Our project page can be found at https://genai-restore.github.io/DiffPGD/.
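To make the two noise sources named in the abstract concrete, the sketch below simulates a low-light capture with a common Poisson-Gaussian model: Poisson-distributed photon counts (shot noise) plus additive Gaussian read noise. The paper's actual noise model is not given in this section, so the function `simulate_low_light` and its parameters (`photons_per_pixel`, `read_noise_std` in electrons) are assumptions for illustration.

```python
import numpy as np

def simulate_low_light(clean, photons_per_pixel, read_noise_std, rng):
    """Apply shot and read noise to a clean image with values in [0, 1].

    Shot noise: photon counts are Poisson with mean clean * photons_per_pixel.
    Read noise: zero-mean Gaussian added to the counts (in electrons).
    The result is rescaled to the clean image's units.
    """
    expected = clean * photons_per_pixel
    shot = rng.poisson(expected).astype(np.float64)
    read = rng.normal(0.0, read_noise_std, size=clean.shape)
    return (shot + read) / photons_per_pixel

rng = np.random.default_rng(0)
clean = np.full((64, 64), 0.5)                      # flat gray test patch
bright = simulate_low_light(clean, 1000.0, 2.0, rng)
dark = simulate_low_light(clean, 10.0, 2.0, rng)

# Empirical SNR on the flat patch: signal mean over noise standard deviation.
snr_bright = clean.mean() / bright.std()
snr_dark = clean.mean() / dark.std()
```

With fewer photons per pixel, both the relative shot noise and the relative contribution of read noise grow, so `snr_dark` falls well below `snr_bright`. This is the low-SNR regime in which the abstract says generative methods tend to hallucinate.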

Paper Structure

This paper contains 14 sections, 10 equations, 14 figures, 6 tables, 3 algorithms.

Figures (14)

  • Figure 1: The restoration of inputs degraded by noise and low-light conditions is highly ill-posed. By incorporating additional high-quality gallery photos of the same identity, we significantly reduce the solution space, thereby achieving improved identity consistency in the restored images.
  • Figure 2: Preview of the visual results. Using gallery photos from a user's smartphone, we can restore low-light, noisy facial images. Our method produces finer details and better identity preservation than state-of-the-art low-light and face restoration approaches such as MIRNet zamir2020learning, DiffLL jiang2023low, GFP-GAN wang2021towards, FourierDiff Lv2024FourierPD, and CodeFormer zhou2022towards. Please zoom in for the best visual experience.
  • Figure 3: Incorporating gallery photos and physical buffers enhances identity preservation. Exp 1: Trained on generic images only $\rightarrow$ outputs a generic face. Exp 2: Adds physical buffer conditioning $\rightarrow$ captures some user-specific features. Exp 3: Fine-tuned with gallery photos $\rightarrow$ generates an arbitrary user face. Exp 4: Combines gallery fine-tuning and physical buffer $\rightarrow$ restores the expected user face.
  • Figure 4: For input images affected by mild low-light noise, physical buffers can be accurately extracted from the input. However, when the input is degraded by severe low light and noise, precise physical buffers are difficult to obtain from it.
  • Figure 5: ID-consistent physical buffer extractor. The extractor derives the ID physical buffers from the target person's gallery photos. It contains an encoder shared across multiple images and a global decoder that predicts the consistent representation. Note that only albedo is shown here; the normal extractor follows a similar strategy. The module weights are initialized from the aggregation network in Zhang2021lap and frozen during training.
  • ...and 9 more figures
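The shared-encoder, global-decoder design described in the Figure 5 caption can be sketched roughly as follows. This toy version assumes that per-image features are fused by mean pooling and that both encoder and decoder are single linear maps; the actual aggregation network is not described in this section. The names `encode` and `extract_id_buffer` and all weights are hypothetical.

```python
import numpy as np

def encode(image, w_enc):
    """Shared encoder: the same weights are applied to every gallery image."""
    return np.tanh(image.reshape(-1) @ w_enc)

def extract_id_buffer(gallery, w_enc, w_dec, out_shape):
    """Fuse per-image features and decode one consistent buffer.

    Mean pooling (an assumption) makes the result independent of the order
    of the gallery photos.
    """
    feats = np.stack([encode(img, w_enc) for img in gallery])  # (N, D)
    pooled = feats.mean(axis=0)                                # (D,)
    return (pooled @ w_dec).reshape(out_shape)                 # global decoder

rng = np.random.default_rng(0)
h = w = 8
gallery = [rng.uniform(0.0, 1.0, size=(h, w)) for _ in range(5)]
w_enc = 0.1 * rng.standard_normal((h * w, 32))   # stand-in frozen weights
w_dec = 0.1 * rng.standard_normal((32, h * w))

albedo = extract_id_buffer(gallery, w_enc, w_dec, (h, w))
albedo_shuffled = extract_id_buffer(gallery[::-1], w_enc, w_dec, (h, w))
```

Because the pooled feature does not depend on photo order, `albedo` and `albedo_shuffled` agree up to floating-point error, which is one simple way to obtain a single identity-level buffer from a variable-size gallery.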