Perception-based Image Denoising via Generative Compression

Nam Nguyen, Thinh Nguyen, Bella Bose

TL;DR

This work reframes image denoising as a perception-constrained lossy compression problem, linking restoration quality to a rate-distortion-perception (RDP) trade-off and establishing a non-asymptotic theory for compression-based maximum-likelihood denoising under additive white Gaussian noise (AWGN). It introduces two complementary instantiations: CGanDeCompress, a conditional WGAN-based compression denoiser that explicitly tunes the RDP trade-off with an LPIPS-driven perceptual loss, and DiffDeCompress, a conditional diffusion-based reconstruction that uses compressed latents to guide iterative, texture-rich restoration. The theoretical results give upper bounds on reconstruction error and decoding error probability in terms of rate, distortion, and perceptual constraints. Empirical results on natural images with synthetic Gaussian noise and on real-world noisy data (fluorescence microscopy and SIDD) show consistent perceptual gains (lower LPIPS, FID, and PI) while maintaining competitive distortion metrics, indicating robustness to distribution shift and domain differences.

Abstract

Image denoising aims to remove noise while preserving structural details and perceptual realism, yet distortion-driven methods often produce over-smoothed reconstructions, especially under strong noise and distribution shift. This paper proposes a generative compression framework for perception-based denoising, where restoration is achieved by reconstructing from entropy-coded latent representations that enforce low-complexity structure, while generative decoders recover realistic textures via perceptual measures such as learned perceptual image patch similarity (LPIPS) loss and Wasserstein distance. Two complementary instantiations are introduced: (i) a conditional Wasserstein GAN (WGAN)-based compression denoiser that explicitly controls the rate-distortion-perception (RDP) trade-off, and (ii) a conditional diffusion-based reconstruction strategy that performs iterative denoising guided by compressed latents. We further establish non-asymptotic guarantees for the compression-based maximum-likelihood denoiser under additive Gaussian noise, including bounds on reconstruction error and decoding error probability. Experiments on synthetic and real-noise benchmarks demonstrate consistent perceptual improvements while maintaining competitive distortion performance.
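To make the idea of a compression-based maximum-likelihood denoiser concrete, here is a minimal toy sketch. It is not the paper's method: the finite random codebook, the function name `ml_denoise`, and all parameter values are assumptions chosen for illustration. It relies only on the standard fact that, under additive Gaussian noise with equal variance in every coordinate, maximizing the likelihood over a set of candidate reconstructions is the same as picking the candidate closest to the noisy observation in Euclidean distance.

```python
import numpy as np


def ml_denoise(y, codebook):
    """Return (codeword, index) of the codeword closest to y.

    Under i.i.d. Gaussian noise, maximizing p(y | c) over codewords c
    is equivalent to minimizing the squared distance ||y - c||^2.
    """
    sq_dist = np.sum((codebook - y) ** 2, axis=1)
    idx = int(np.argmin(sq_dist))
    return codebook[idx], idx


# Hypothetical setup: a random codebook standing in for the set of
# reconstructions a lossy code can produce at a rate of `rate_bits` bits.
rng = np.random.default_rng(0)
n, rate_bits = 16, 6
codebook = rng.uniform(0.0, 1.0, size=(2 ** rate_bits, n))

# Clean signal is one of the codewords; observe it through Gaussian noise.
true_idx = 5
x = codebook[true_idx]
sigma = 0.05
y = x + sigma * rng.normal(size=n)

x_hat, idx = ml_denoise(y, codebook)
print("decoded index:", idx)
print("squared error:", float(np.sum((x_hat - x) ** 2)))
```

In this toy setting a larger codebook (higher rate) can represent the clean signal more finely but makes it easier for noise to push the observation toward a wrong codeword, which is the kind of tension that bounds on reconstruction error and decoding error probability quantify.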

Paper Structure (17 sections, 3 theorems, 43 equations, 8 figures, 3 tables)

Key Result

Theorem 1

The distortion-perception (DP) function defined in prob:D_P_func admits the closed form [...], where $(x)_+=\max(0,x)$.
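For orientation, the theorem listing attributes this result to the citation key freirich2021theory. A closed form of the following shape is known from that line of work for squared-error distortion with the Wasserstein-2 distance as the perception measure. The symbols $D^*$, $P^*$, and $X^*$ are assumptions introduced here and may not match the paper's notation:

```latex
D(P) \;=\; D^* + \big[(P^* - P)_+\big]^2,
\qquad
P^* \;=\; W_2\!\left(p_X,\; p_{X^*}\right),
```

where $X^*$ denotes the minimum mean-squared-error estimator, $D^*$ its distortion, and $W_2$ the Wasserstein-2 distance. Read this way, tightening the perceptual constraint $P$ below $P^*$ costs extra distortion that grows quadratically in the gap, while any $P \ge P^*$ costs nothing beyond $D^*$.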

Figures (8)

  • Figure 1: Samples generated by our methods on KODAK images. From left to right: original, noisy observation with Gaussian noise $\mathcal{N}(0,\sigma^2)$ with $\sigma=50$, and denoised output.
  • Figure 2: Conditional WGAN-based denoising framework.
  • Figure 3: Conditional diffusion-based denoising architecture.
  • Figure 4: Visual comparison of different denoising methods on randomly selected KODAK images with noise level $\sigma=25$. The first two columns show the original images and their noisy observations, followed by results generated by our methods and popular baselines. The last two rows provide zoomed-in views for detailed comparison. Best viewed on screen.
  • Figure 5: Visual comparison of denoising methods on randomly selected DIV2K images at noise level $\sigma=50$. The first two columns present the original images and their noisy observations, followed by the denoised results generated by our methods and popular baselines. Zoomed-in regions are shown in the last two rows for detailed inspection. Best viewed on screen.
  • ...and 3 more figures
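The captions above specify the synthetic noise as $\mathcal{N}(0,\sigma^2)$ with $\sigma=25$ or $\sigma=50$. A minimal sketch of how such a noisy observation could be generated is given below. It assumes that $\sigma$ is expressed on the 0 to 255 intensity scale, which is a common convention but is not stated in the text shown here; the helper names `add_awgn` and `psnr` and the flat gray test image are illustrative only.

```python
import numpy as np


def add_awgn(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma.

    The result is left unclipped as float64 so the noise statistics
    are exactly N(0, sigma^2) per pixel.
    """
    return image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)


rng = np.random.default_rng(0)
# Stand-in for a real image: a flat mid-gray 256x256 array.
clean = np.full((256, 256), 128, dtype=np.uint8)

for sigma in (25, 50):
    noisy = add_awgn(clean, sigma, rng)
    print(f"sigma={sigma}: PSNR of noisy input = {psnr(clean, noisy):.2f} dB")
```

Without clipping, the PSNR of the noisy input is close to $20\log_{10}(255/\sigma)$, which gives a quick sanity check on the noise level before running any denoiser.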

Theorems & Definitions (4)

  • Theorem 1 (citation key: freirich2021theory)
  • Theorem 2
  • Proof
  • Corollary 1