High Perceptual Quality Image Denoising with a Posterior Sampling CGAN
Guy Ohayon, Theo Adrai, Gregory Vaksman, Michael Elad, Peyman Milanfar
TL;DR
The paper addresses the tradeoff between perceptual quality and distortion in image denoising by proposing PSCGAN, a posterior-sampling CGAN that samples from $\mathbb{P}_{\mathbf{x}|\mathbf{y}}$ and adds a mean-consistency penalty to prevent mode collapse. The method uses a StyleGAN2/UNet-inspired encoder–decoder with multi-scale noise injections and a gradient-penalized WGAN objective, producing diverse, sharp denoised outputs that remain faithful to the clean image on average. The denoiser can be used in two ways: PSCGAN draws individual posterior samples, while PSCGAN-A averages multiple samples to approximate the MMSE estimate. Experiments on the FFHQ and LSUN datasets, across noise levels, show strong perceptual quality (low FID) and PSNR competitive with an MMSE denoiser. The results highlight the method’s ability to traverse the perception-distortion tradeoff, to leave Gaussian-like remainder noise, and to produce non-blurry denoised images in high-noise scenarios.
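The idea that averaging posterior samples approximates the MMSE estimate can be illustrated on a toy scalar problem. The sketch below is not the paper's model: it assumes a hypothetical setting with x ~ N(0, 1) and y = x + n, n ~ N(0, sigma^2), where the posterior is known in closed form, and the names `posterior_sample` and `averaged_estimate` are illustrative only.

```python
import random


def posterior_sample(y, sigma, rng):
    """Draw one sample from p(x | y) for the toy model
    x ~ N(0, 1), y = x + n, n ~ N(0, sigma^2) (an assumption of this sketch)."""
    post_mean = y / (1.0 + sigma**2)
    post_std = (sigma**2 / (1.0 + sigma**2)) ** 0.5
    return rng.gauss(post_mean, post_std)


def averaged_estimate(y, sigma, num_samples, rng):
    """Average several posterior samples, analogous in spirit to PSCGAN-A."""
    samples = [posterior_sample(y, sigma, rng) for _ in range(num_samples)]
    return sum(samples) / num_samples


rng = random.Random(0)          # fixed seed so the run is repeatable
sigma = 1.0                     # noise standard deviation
y = 0.8                         # a single noisy observation
mmse = y / (1.0 + sigma**2)     # closed-form posterior mean (MMSE estimate)

single = posterior_sample(y, sigma, rng)          # one stochastic "denoised" value
averaged = averaged_estimate(y, sigma, 4096, rng)  # average of many samples

print("MMSE estimate:   ", mmse)
print("single sample:   ", single)
print("averaged samples:", averaged)
```

In this toy case a single sample scatters around the posterior mean, while the average of many samples concentrates on it. This is the sense in which an averaged sampler trades sample diversity for lower distortion.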
Abstract
The vast body of work in Deep Learning (DL) has led to a leap in image denoising research. Most DL solutions for this task focus on the denoiser's architecture while maximizing distortion performance. However, distortion-driven solutions lead to blurry results with sub-optimal perceptual quality, especially at high noise levels. In this paper we propose a different perspective, aiming to produce sharp and visually pleasing denoised images that are still faithful to their clean sources. Formally, our goal is to achieve high perceptual quality with acceptable distortion. This is attained by a stochastic denoiser that samples from the posterior distribution, trained as a generator in the framework of conditional generative adversarial networks (CGAN). In contrast to distortion-based regularization terms, which conflict with perceptual quality, we introduce to the CGAN objective a theoretically founded penalty term that does not impose a distortion requirement on individual samples, but rather on their mean. We showcase our proposed method with a novel denoiser architecture that achieves this reformulated denoising goal and produces vivid and diverse outcomes at high noise levels.
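The difference between penalizing individual samples and penalizing their mean can be made concrete with a small numerical sketch. It is not the paper's training code: the function names are hypothetical, images are stood in for by short lists of floats, and the squared-error form of the penalty is an assumption made for illustration.

```python
def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def per_sample_penalty(clean, samples):
    """Average distortion of each sample on its own (the kind of term
    that pushes every output toward the same blurry estimate)."""
    return sum(squared_error(clean, s) for s in samples) / len(samples)


def mean_penalty(clean, samples):
    """Distortion of the sample mean only: individual samples are free
    to differ as long as their average stays close to the clean signal."""
    n = len(samples)
    mean = [sum(s[i] for s in samples) / n for i in range(len(clean))]
    return squared_error(clean, mean)


clean = [1.0, 2.0]                      # stand-in for a clean image
samples = [[0.0, 2.5], [2.0, 1.5]]      # two diverse stand-in generator outputs

print(per_sample_penalty(clean, samples))  # 1.25
print(mean_penalty(clean, samples))        # 0.0
```

Here the two samples differ noticeably from the clean signal, yet their mean matches it exactly, so the mean-based term is zero while the per-sample term is not. Under these assumptions, a mean-based penalty leaves room for diverse outputs.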
