Perception-based Image Denoising via Generative Compression
Nam Nguyen, Thinh Nguyen, Bella Bose
TL;DR
This work reframes image denoising as a perception-constrained lossy compression problem, linking restoration quality to a rate-distortion-perception (RDP) trade-off and establishing a non-asymptotic theory for compression-based maximum-likelihood denoising under additive white Gaussian noise (AWGN). It introduces two complementary instantiations: CGanDeCompress, a conditional Wasserstein GAN (WGAN)-based compression denoiser that explicitly tunes the RDP trade-off with a perceptual loss based on learned perceptual image patch similarity (LPIPS), and DiffDeCompress, a conditional diffusion-based reconstruction that uses compressed latents to guide iterative, texture-rich restoration. The theoretical results give upper bounds on reconstruction error and decoding error probability in terms of rate, distortion, and perceptual constraints. Experiments on natural images with synthetic Gaussian noise and on real-world noisy data (fluorescence microscopy and the Smartphone Image Denoising Dataset, SIDD) show consistent perceptual gains (lower LPIPS, Fréchet Inception Distance (FID), and perceptual index (PI)) while maintaining competitive distortion metrics, which the authors present as evidence of robustness to distribution shift and domain differences.
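For orientation, the RDP trade-off mentioned above is commonly formalized in the rate-distortion-perception literature as the constrained problem below. This is the generic form and not necessarily the exact formulation used in this paper; the symbols $X$, $\hat{X}$, $\Delta$, $d$, $D$, and $P$ are assumed here for illustration.

```latex
R(D, P) \;=\; \min_{p_{\hat{X}\mid X}} \; I(X; \hat{X})
\quad \text{s.t.} \quad
\mathbb{E}\big[\Delta(X, \hat{X})\big] \le D,
\qquad
d\big(p_X, p_{\hat{X}}\big) \le P
```

Here $\Delta$ is a distortion measure (for example squared error) and $d$ is a divergence between the source and reconstruction distributions (for example a Wasserstein distance). Tightening $P$ favors realism, typically at the cost of a higher rate or a larger distortion.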
Abstract
Image denoising aims to remove noise while preserving structural details and perceptual realism, yet distortion-driven methods often produce over-smoothed reconstructions, especially under strong noise and distribution shift. This paper proposes a generative compression framework for perception-based denoising, where restoration is achieved by reconstructing from entropy-coded latent representations that enforce low-complexity structure, while generative decoders recover realistic textures via perceptual measures such as learned perceptual image patch similarity (LPIPS) loss and Wasserstein distance. Two complementary instantiations are introduced: (i) a conditional Wasserstein GAN (WGAN)-based compression denoiser that explicitly controls the rate-distortion-perception (RDP) trade-off, and (ii) a conditional diffusion-based reconstruction strategy that performs iterative denoising guided by compressed latents. We further establish non-asymptotic guarantees for the compression-based maximum-likelihood denoiser under additive Gaussian noise, including bounds on reconstruction error and decoding error probability. Experiments on synthetic and real-noise benchmarks demonstrate consistent perceptual improvements while maintaining competitive distortion performance.
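The idea of a compression-based maximum-likelihood denoiser under additive Gaussian noise can be illustrated with a toy sketch. Under white Gaussian noise with equal variance per sample, the likelihood of a candidate reconstruction decreases monotonically with its squared Euclidean distance to the observation, so maximum-likelihood decoding over a finite set of candidates reduces to nearest-codeword search. The sketch below assumes a tiny hand-written codebook standing in for the set of reconstructions reachable at a given rate; the names `ml_denoise`, `squared_distance`, and `CODEBOOK` are hypothetical, and this is not the paper's learned codec.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ml_denoise(noisy, codebook):
    """Return (codeword, index) of the codeword nearest to `noisy`.

    Under additive white Gaussian noise, the likelihood of codeword c is
    proportional to exp(-||noisy - c||^2 / (2 * sigma^2)), so the
    maximum-likelihood codeword is the one at minimum squared distance.
    """
    best_index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(noisy, codebook[i]),
    )
    return list(codebook[best_index]), best_index


# Hypothetical toy codebook: four "low-complexity" signals of length 8.
CODEBOOK = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
]

rng = random.Random(0)   # fixed seed so the example is reproducible
sigma = 0.1              # assumed noise standard deviation
clean = CODEBOOK[2]
noisy = [x + rng.gauss(0.0, sigma) for x in clean]

denoised, index = ml_denoise(noisy, CODEBOOK)
```

In this toy setting the rate is fixed by the codebook size (four codewords, i.e. two bits per signal). A larger codebook can represent the clean signal more accurately but also makes it easier for noise to push the observation toward a wrong codeword, which is the kind of tension that bounds on reconstruction error and decoding error probability describe.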
