The Performance of Compression-Based Denoisers

Dan Song, Ayfer Özgür, Tsachy Weissman

TL;DR

The authors extend compression-based denoising from additive noise to general discrete memoryless channels by selecting a channel-matching distortion ρ(z,y) = -log p_{Z|X}(z|y) and operating at D = H(Z|X). They prove that, under mild mixing and invertibility conditions, lossy compression of Z^n yields reconstructions Y^n whose joint behavior with X^n and Z^n asymptotically samples from the posterior X|Z, enabling an exact, general loss expression: E[Λ_n(X^n,Y^n)] -> E_Z E_{U,V~P_{X|Z}}[Λ(U,V)]. This leads to precise characterizations of denoising performance in special cases (e.g., MSE yields a factor-2 improvement over prior bounds), and clarifies the relationship to indirect rate-distortion and rate-distortion-perception frameworks. The results significantly strengthen prior work and offer a versatile denoising paradigm for diverse channels, with practical implications for posterior sampling and rate-limited observation handling.
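As a worked instance of the limiting loss expression above, consider squared-error loss $\Lambda(u,v) = (u-v)^2$ in single-letter form. This is a sketch under the assumption that $U$ and $V$ are conditionally independent given $Z$, each distributed as $P_{X|Z}$, which is how the expectation $E_{U,V \sim P_{X|Z}}$ is read here. It shows that posterior sampling incurs exactly twice the minimum mean-square error; how this constant relates to the "factor-2 improvement over prior bounds" is not spelled out in this summary.

```latex
\begin{align*}
E\big[(U-V)^2 \,\big|\, Z\big]
  &= E[U^2 \mid Z] - 2\,E[U \mid Z]\,E[V \mid Z] + E[V^2 \mid Z] \\
  &= 2\big(E[X^2 \mid Z] - E[X \mid Z]^2\big) \\
  &= 2\,\mathrm{Var}(X \mid Z), \\
E_Z\, E_{U,V \sim P_{X|Z}}\big[(U-V)^2\big]
  &= 2\,E\big[\mathrm{Var}(X \mid Z)\big]
   = 2\,E\big[(X - E[X \mid Z])^2\big].
\end{align*}
```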

Abstract

We consider a denoiser that reconstructs a stationary ergodic source by lossily compressing samples of the source observed through a memoryless noisy channel. Prior work on compression-based denoising has been limited to additive noise channels. We extend this framework to general discrete memoryless channels by deliberately choosing the distortion measure for the lossy compressor to match the channel conditional distribution. By bounding the deviation of the empirical joint distribution of the source, observation, and denoiser outputs from satisfying a Markov property, we give an exact characterization of the loss achieved by such a denoiser. Consequences of these results are explicitly demonstrated in special cases, including for MSE and Hamming loss. A comparison is made to an indirect rate-distortion perspective on the problem.
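The quantities named above can be computed directly in a toy setting. The following is a minimal sketch, not the authors' code: it assumes an i.i.d. source (the paper treats general stationary ergodic sources), finite alphabets, a reconstruction alphabet equal to the source alphabet, and natural logarithms. The function names are hypothetical. It builds the channel-matched distortion $\rho(z,y) = -\log p_{Z|X}(z|y)$, the distortion level $D = H(Z|X)$, and the single-letter posterior-sampling loss $E_Z E_{U,V \sim P_{X|Z}}[\Lambda(U,V)]$, alongside the Bayes-optimal loss for comparison.

```python
import numpy as np

def channel_matched_distortion(channel):
    """rho[z, y] = -log P(Z=z | X=y), where channel[x, z] = P(Z=z | X=x).

    Assumes the reconstruction alphabet equals the source alphabet.
    Zero-probability transitions get infinite distortion.
    """
    with np.errstate(divide="ignore"):
        return -np.log(channel.T)

def conditional_entropy(prior, channel):
    """H(Z|X) in nats for an i.i.d. source with the given prior (assumption)."""
    joint = prior[:, None] * channel          # joint[x, z] = P(X=x, Z=z)
    mask = joint > 0
    return float(-np.sum(joint[mask] * np.log(channel[mask])))

def posterior_sampling_loss(prior, channel, loss):
    """E_Z E_{U,V ~ P_{X|Z}}[loss(U, V)] with U, V independent given Z."""
    joint = prior[:, None] * channel
    p_z = joint.sum(axis=0)
    total = 0.0
    for z in np.flatnonzero(p_z > 0):
        post = joint[:, z] / p_z[z]           # posterior P(X = . | Z = z)
        total += p_z[z] * (post @ loss @ post)
    return float(total)

def bayes_loss(prior, channel, loss):
    """Minimum expected loss of any symbol-by-symbol estimator of X from Z."""
    joint = prior[:, None] * channel
    risk = joint.T @ loss                     # risk[z, y] = sum_x P(x, z) loss[x, y]
    return float(risk.min(axis=1).sum())

# Toy example: uniform binary source through a BSC with crossover 0.1.
delta = 0.1
prior = np.array([0.5, 0.5])
bsc = np.array([[1 - delta, delta], [delta, 1 - delta]])
hamming = 1.0 - np.eye(2)

rho = channel_matched_distortion(bsc)
D = conditional_entropy(prior, bsc)
sampled = posterior_sampling_loss(prior, bsc, hamming)
optimal = bayes_loss(prior, bsc, hamming)
print(f"D = H(Z|X) = {D:.4f} nats")
print(f"posterior-sampling Hamming loss = {sampled:.4f}, Bayes loss = {optimal:.4f}")
```

In this toy case the posterior-sampling Hamming loss is $2\delta(1-\delta)$ while the Bayes loss is $\min(\delta, 1-\delta)$, so the sampled loss is below twice the Bayes loss. With memory in the source, the posterior would depend on the whole observation sequence and this single-letter computation would no longer apply as written.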

Paper Structure

This paper contains 14 sections, 13 theorems, 97 equations, 4 figures.

Key Result

Theorem 1

Suppose the alphabets $\mathcal{X}, \mathcal{Z}, \mathcal{Y}$ are finite. Suppose the sequence of codes $\left\{ Y^n\left(\cdot\right) \right\}_n$ is good at $(R\left(\mathbf{Z}, D\right), D)$. Suppose the condition holds, and suppose that $R\left(Z^k, D\right)$ is uniquely achieved by the distribution of the pair $\left(\tilde{Z}^k, \tilde{Y}^k\right)$. Then,

Figures (4)

  • Figure 1: Setting considered in this work. The source $X^n$ and the channel $P_{Z|X}$ are fixed and known, and $\rho$ and $D$ are designed so that the reconstruction $Y^n$ is close to $X^n$.
  • Figure 2: Comparison of the Bayes envelope, the compression-based denoiser loss, and the suboptimal upper bound for denoising a binary $\mathbf{X}$ passed through a BSC under Hamming loss (\ref{ex:bsserasure}).
  • Figure 3: Performance of the compression-based denoiser in \ref{ex:bsserasure} for various switching probabilities $p_s$ as a function of the erasure probability $p_e$, compared to the Bayes response and the upper bound \ref{eq:oldupperbound}. Although \ref{thm:five} is needed to apply the analysis, \ref{eq:oldupperbound} remains a valid upper bound.
  • Figure 4: Performance of the compression-based denoiser in \ref{ex:bsserasure} for various erasure probabilities $p_e$ as a function of the switching probability $p_s$, compared to the Bayes response and the upper bound \ref{eq:oldupperbound}.

Theorems & Definitions (24)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 1: Theorem 3 of weissman_empirical_2005
  • Theorem 2: Theorem 4 of weissman_empirical_2005
  • Corollary 1
  • Theorem 3: Theorem 5 of weissman_empirical_2005
  • Theorem 4
  • Corollary 2
  • ...and 14 more