A Scalable Training Strategy for Blind Multi-Distribution Noise Removal

Kevin Zhang; Sakshum Kulshrestha; Christopher Metzler

TL;DR

This work improves upon a recently proposed universal denoiser training strategy by extending these results to higher dimensions and by incorporating a polynomial approximation of the true specification-loss landscape.

Abstract

Despite recent advances, developing general-purpose universal denoising and artifact-removal networks remains largely an open problem: given fixed network weights, one inherently trades off specialization at one task (e.g., removing Poisson noise) for performance at another (e.g., removing speckle noise). In addition, training such a network is challenging due to the curse of dimensionality: as one increases the dimensions of the specification space (i.e., the number of parameters needed to describe the noise distribution), the number of unique specifications one needs to train for grows exponentially. Uniformly sampling this space will result in a network that does well at very challenging problem specifications but poorly at easy problem specifications, where even large errors will have a small effect on the overall mean squared error. In this work we propose training denoising networks using an adaptive-sampling/active-learning strategy. Our work improves upon a recently proposed universal denoiser training strategy by extending these results to higher dimensions and by incorporating a polynomial approximation of the true specification-loss landscape. This approximation allows us to reduce training times by almost two orders of magnitude. We test our method on simulated joint Poisson-Gaussian-Speckle noise and demonstrate that, with our proposed training strategy, a single blind, generalist denoiser network can achieve peak signal-to-noise ratios within a uniform bound of specialized denoiser networks across a large range of operating conditions. We also capture a small dataset of images with varying amounts of joint Poisson-Gaussian-Speckle noise and demonstrate that a universal denoiser trained using our adaptive-sampling strategy outperforms uniformly trained baselines.
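The two ingredients named in the abstract, a polynomial approximation of the specification-loss landscape and a non-uniform sampling of specifications, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' algorithm: the quadratic degree, the two-dimensional specification space, the function names, and in particular the simple gap-proportional reweighting, which stands in for whatever update rule the paper actually uses. The idea is that PSNRs of specialized denoisers measured at a few specifications are fit with a low-degree polynomial, the fit is evaluated everywhere else in place of training more specialized networks, and specifications where the universal denoiser trails the approximated ideal most are sampled more often.

```python
import numpy as np


def poly_features(specs, degree=2):
    """Monomials s1**i * s2**j with i + j <= degree for 2-D specifications."""
    s1, s2 = specs[:, 0], specs[:, 1]
    cols = [s1**i * s2**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.stack(cols, axis=1)


def fit_landscape(specs, psnr, degree=2):
    """Least-squares polynomial fit to PSNRs measured at sparse specifications."""
    coeffs, *_ = np.linalg.lstsq(poly_features(specs, degree), psnr, rcond=None)
    return coeffs


def predict_landscape(coeffs, specs, degree=2):
    """Evaluate the fitted polynomial at arbitrary specifications."""
    return poly_features(specs, degree) @ coeffs


def sampling_weights(ideal_psnr, universal_psnr):
    """Hypothetical reweighting: sample specs in proportion to the PSNR gap.

    Specifications where the universal denoiser already matches or beats the
    approximated ideal get zero weight; if no gap exists, fall back to uniform.
    """
    gap = np.clip(ideal_psnr - universal_psnr, 0.0, None)
    if gap.sum() == 0.0:
        return np.full(gap.shape, 1.0 / gap.size)
    return gap / gap.sum()
```

In a training loop, the weights would be recomputed periodically from the current universal denoiser's PSNRs and passed to something like `numpy.random.Generator.choice` to draw the next batch of noise specifications. The saving comes from `predict_landscape` replacing the training of one specialized denoiser per specification.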

Paper Structure (43 sections, 21 equations, 21 figures, 6 tables)

Introduction
Our Contribution
Related Work
Adaptive Denoising
Universal Denoising
Training Strategies
Relationship to Existing Works
Problem Formulation
Noise Model
Specification-Loss Landscape
The Uniform Gap Problem
Proposed Method
Adaptive Training
Specification-loss Landscape Approximations
Exponential Savings in Training-Time
...and 28 more sections

Figures (21)

Figure 1: Varying the noise specifications. The first row shows images corrupted by Gaussian noise, the second row shows images corrupted by Poisson noise, and the last row shows images corrupted by speckle noise. In each of the rows, the other noise parameters are held fixed at $0$, $0.01$, and $1.00$, respectively.
Figure 2: Loss Landscape Visualizations. PSNR, which we use as our metric for error, versus denoising task specifications. The specification-loss landscapes (which represent the PSNRs a specialized denoiser can achieve at each specification) are smooth and amenable to approximation.
Figure 3: Adaptive vs Uniform Training, 2D Specification Space. Adaptive training with sparse sampling and the polynomial approximation works effectively in the 2D problem space and produces a network whose performance is consistently close to the ideal. By contrast, a network trained by uniformly sampling from the space performs far worse than the specialized networks in certain contexts. The error bars represent one standard deviation. Lower is better.
Figure 4: Adaptive vs Uniform Training, 3D Specification Space. Adaptive sampling with the polynomial approximation works effectively in the 3D problem space and produces a network whose performance is consistently close to the ideal. By contrast, a network trained by uniformly sampling from the space performs far worse than the specialized networks in certain contexts. The error bars represent one standard deviation. Lower is better.
Figure 5: Qualitative comparisons, simulated data. Comparison of the ideal, uniform-trained, and adaptive-distribution, sparse-sampling-trained denoisers on a sample image corrupted with a low amount of noise and with a high amount of noise. Our blind training strategy, based on an adaptive distribution and a sparse approximation, performs only marginally worse than an ideal, non-blind baseline on "easy" problem specifications and significantly better than the uniform baseline, while remaining only marginally worse than both the ideal and uniform baselines on "hard" problem specifications.
...and 16 more figures

Theorems & Definitions (4)

Example 1
proof
Example 2
proof
