On normalization-equivariance properties of supervised and unsupervised denoising methods: a survey

Sébastien Herbreteau, Charles Kervrann

TL;DR

This survey examines normalization equivariance (NE) in image denoising across supervised, weakly supervised, and unsupervised methods. It covers the Poisson-Gaussian and additive white Gaussian noise (AWGN) models, variance-stabilizing transformations, and architectures ranging from MLPs and CNNs to Transformers, including DnCNN, DRUNet, and SCUNet. Strong traditional unsupervised denoisers (BM3D, NL-Bayes) still rival or outperform many deep networks on standard benchmarks, while CNNs with attention and Transformer-based denoisers push performance forward but often lack NE unless they are designed for it. The paper advocates NE-by-design approaches (e.g., affine convolutions, sort pooling) to improve robustness to scaling and shifting of the input, and it discusses practical considerations: data requirements, weak supervision, and the role of large-scale supervised datasets in achieving state-of-the-art results.

Abstract

Image denoising is probably the oldest and still one of the most active research topics in image processing. Many methodological concepts have been introduced in the past decades, and performance has improved significantly in recent years, especially with the emergence of convolutional neural networks and supervised deep learning. In this paper, we propose a survey and guided tour of supervised and unsupervised learning methods for image denoising, classifying the main principles elaborated during this evolution, with particular attention given to recent developments in supervised learning. It is conceived as a tutorial that organizes current approaches in a comprehensive framework. We give insights into the rationales and limitations of the best-performing methods in the literature, and we highlight the features that many of them share. Finally, we focus on the normalization-equivariance property, which, surprisingly, is not guaranteed by most supervised methods. It is of paramount importance that intensity shifting or scaling applied to the input image results in a corresponding change in the denoiser output.

Paper Structure

This paper contains 49 sections, 1 theorem, 58 equations, 10 figures, 1 table, 2 algorithms.

Key Result

Proposition 1

For all $s < t \in \mathbb{R}$ and all $f : \mathbb{R}^n \to \mathbb{R}^m$, $\mathcal{T}^{-1}_{s, t} \circ f \circ \mathcal{T}_{s, t}$ is a normalization-equivariant function.
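
The proposition can be checked numerically with a minimal sketch. This excerpt does not define $\mathcal{T}_{s, t}$, so the code assumes it is the min-max normalization that maps the input range onto $[s, t]$, and that $\mathcal{T}^{-1}_{s, t}$ undoes it using the minimum and maximum of the same input. It also assumes that normalization equivariance means $g(a x + b) = a\, g(x) + b$ for every $a > 0$ and $b \in \mathbb{R}$, in line with the shifting and scaling requirement stated in the abstract. The function name `make_normalization_equivariant` and the test function `f` are illustrative, not taken from the paper.

```python
import numpy as np

def make_normalization_equivariant(f, s=0.0, t=1.0):
    """Wrap an arbitrary f as T^{-1}_{s,t} o f o T_{s,t}.

    Assumption: T_{s,t} is min-max normalization onto [s, t], and the
    inverse reuses the min and max of the same input.
    """
    if not s < t:
        raise ValueError("s must be strictly smaller than t")

    def wrapped(x):
        x = np.asarray(x, dtype=np.float64)
        lo, hi = x.min(), x.max()
        if hi == lo:
            # The normalization is undefined for a constant input.
            raise ValueError("constant input cannot be min-max normalized")
        scale = (hi - lo) / (t - s)
        z = (x - lo) / scale + s              # T_{s,t}: [lo, hi] -> [s, t]
        y = np.asarray(f(z), dtype=np.float64)
        return (y - s) * scale + lo           # T^{-1}_{s,t} with the same lo, hi

    return wrapped

# A deliberately non-equivariant function, chosen only for illustration.
f = lambda z: np.tanh(3.0 * z) ** 2
g = make_normalization_equivariant(f)

rng = np.random.default_rng(0)
x = rng.normal(size=16)
a, b = 2.5, -7.0

print("f equivariant:", np.allclose(f(a * x + b), a * f(x) + b))
print("g equivariant:", np.allclose(g(a * x + b), a * g(x) + b))
```

Under these assumptions the wrapper works because $a x + b$ and $x$ normalize to the same array when $a > 0$, so `f` sees identical inputs and only the final rescaling differs. Constant inputs are rejected here as a simplification, since min-max normalization is undefined for them.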

Figures (10)

  • Figure 1: Execution time on CPU for images of size $512\times512$ vs. the average PSNR on the union of the Set12 and BSD68 datasets, for Gaussian noise with $\sigma=25$ and popular methods.
  • Figure 2: A $3 \times 3$ 2D convolution (without padding) producing 4 output neurons.
  • Figure 3: The architecture of the DnCNN denoising network. Source: [dncnn].
  • Figure 4: The architecture of the DRUNet denoising network. It takes an additional noise level map as input and combines U-Net [unet] and ResNet [resnet]. "SConv" and "TConv" represent $2 \times 2$ strided convolution and transposed convolution, respectively. Source: [drunet].
  • Figure 5: The architecture of the SCUNet denoising network. "SConv", "TConv", "RConv" and "SwinT" represent $2 \times 2$ strided convolution, $2 \times 2$ strided transposed convolution, residual "$3 \times 3$ conv + ReLU + $3 \times 3$ conv" block and Swin Transformer block, respectively. Source: [scunet].
  • ...and 5 more figures

Theorems & Definitions (2)

  • Definition 1
  • Proposition 1