Masked Pre-training Enables Universal Zero-shot Denoiser

Xiaoxiao Ma, Zhixiang Wei, Yi Jin, Pengyang Ling, Tianle Liu, Ben Wang, Junkang Dai, Huaian Chen

TL;DR

A novel zero-shot denoising paradigm, Masked Pre-train then Iterative fill (MPI), which first trains a model via masking and then uses the pre-trained weights for high-quality zero-shot denoising of a single noisy image.

Abstract

In this work, we observe that a model trained on vast general images via a masking strategy naturally embeds their distribution knowledge and thus spontaneously acquires the underlying potential for strong image denoising. Based on this observation, we propose a novel zero-shot denoising paradigm, Masked Pre-train then Iterative fill (MPI). MPI first trains a model via masking and then employs the pre-trained weights for high-quality zero-shot denoising of a single noisy image. Concretely, MPI comprises two key procedures: 1) Masked Pre-training trains the model to reconstruct massive natural images under random masking, yielding generalizable representations and the potential for valid zero-shot denoising on images with varying noise degradation and even on distinct image types. 2) Iterative filling exploits the pre-trained knowledge for effective zero-shot denoising. It iteratively optimizes the image by leveraging the pre-trained weights, alternately reconstructing different parts of the image, and gradually assembles a fully denoised image within a limited number of iterations. Comprehensive experiments across various noisy scenarios underscore the notable advances of MPI over previous approaches, with a marked reduction in inference time. Code is available at https://github.com/krennic999/MPI.
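
As a rough illustration of the Masked Pre-training step, the sketch below builds a random pixel-wise mask and scores a reconstruction on the hidden pixels. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names `random_pixel_mask` and `masked_reconstruction_loss` are hypothetical, the mean-of-visible-pixels "prediction" is only a placeholder for a real network, and restricting an L1 loss to the hidden pixels is an assumption about the objective. The 70% masking ratio follows the Figure 2 caption.

```python
import numpy as np

def random_pixel_mask(shape, mask_ratio, rng):
    """Binary mask: 1.0 = pixel stays visible, 0.0 = pixel is masked out."""
    return (rng.random(shape) >= mask_ratio).astype(np.float32)

def masked_reconstruction_loss(prediction, target, mask):
    """Mean absolute error over the hidden pixels only (assumed objective)."""
    hidden = 1.0 - mask
    n_hidden = hidden.sum()
    if n_hidden == 0:
        return 0.0
    return float((np.abs(prediction - target) * hidden).sum() / n_hidden)

rng = np.random.default_rng(0)
image = rng.random((64, 64)).astype(np.float32)  # stand-in for a natural image
mask = random_pixel_mask(image.shape, mask_ratio=0.7, rng=rng)
masked_input = image * mask                      # what the network would see

# Placeholder "network": predict the mean of the visible pixels everywhere.
prediction = np.full_like(image, masked_input.sum() / mask.sum())
loss = masked_reconstruction_loss(prediction, image, mask)
```

In an actual pre-training loop, `prediction` would come from the network applied to `masked_input`, and the loss would be back-propagated over many images with a fresh mask per sample.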

Paper Structure (45 sections, 14 equations, 35 figures, 16 tables, 1 algorithm)

Figures (35)

  • Figure 1: (a) Our method surpasses current zero-shot methods with reduced inference time (on CSet with Gaussian noise, $\sigma = 25$; see Sec. \ref{sec:awgn&poisson}). (b) It generalizes better across different noise types than current zero-shot and supervised/unsupervised methods (Sec. \ref{sec:syn_general}). (c) It can also remove spatially correlated real-world noise; results are from the SIDD benchmark (abdelhamed2018sidd) and FMD (zhang2019fmd) (Sec. \ref{sec:real_noise}, Sec. \ref{sec:medical_noise}).
  • Figure 2: Example of a model trained on ImageNet with 70% pixel-wise masking. The denoised image is obtained by directly ensembling predictions from the fixed pre-trained weights ("Directly ensemble"); its performance can be further improved with iterative filling ("+Zero-shot Optim.").
  • Figure 3: Evaluation on an ImageNet subset shows the pre-trained model's inherent denoising ability, but performance is limited without optimization.
  • Figure 4: An overview of the proposed MPI paradigm, consisting of Masked Pre-training and Iterative filling. During pre-training, $\mathcal{D}_\theta(\cdot)$ learns to reconstruct masked natural images, and the pre-trained weights $\theta$ are saved for zero-shot denoising, i.e., Iterative filling, of a specific noisy image $x$. During zero-shot inference, the network is initialized with the pre-trained weights $\theta$, which are then further optimized on $x$ for $T$ steps; results from the $t$-th optimization steps ($t = 1, 2, \ldots, T-1$) are gathered to obtain the final denoised prediction $\overline{y}$. Compared to current zero-shot methods, adding just one more step, loading a pre-trained model, enables faster and higher-quality zero-shot denoising.
  • Figure 5: Qualitative denoising results on Gaussian and Poisson noise. The quantitative PSNR/SSIM results are provided underneath. Noisy patches are from CBSD-44 and McMaster-14, respectively. Best viewed in color (zoom-in for better comparison).
  • ...and 30 more figures
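
The Iterative filling loop described in the Figure 4 caption (start from given weights, optimize them on the single noisy image $x$ with fresh masks, and gather the per-step predictions into $\overline{y}$) can be sketched as follows. This is a toy sketch under loud assumptions, not the paper's method: a single 3x3 linear filter stands in for the network $\mathcal{D}_\theta$, the hand-set averaging filter `theta0` stands in for pre-trained weights, and the squared-error loss on hidden pixels, the 30% mask ratio, the learning rate, the step count, and the plain per-pixel averaging of predictions are all illustrative choices that this excerpt does not specify. All function names are hypothetical.

```python
import numpy as np

# Offsets of a 3x3 neighbourhood; index 4 is the centre pixel (0, 0).
SHIFTS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def neighbourhood(img):
    """Stack the 9 circularly shifted copies of img, shape (9, H, W)."""
    return np.stack([np.roll(img, shift, axis=(0, 1)) for shift in SHIFTS])

def iterative_fill(noisy, theta, steps=200, mask_ratio=0.3, lr=0.05, seed=0):
    """Toy iterative filling with a 3x3 linear filter as the 'network'."""
    rng = np.random.default_rng(seed)
    theta = theta.copy()                  # start from the given weights
    pred_sum = np.zeros_like(noisy)       # sum of predictions per pixel
    pred_cnt = np.zeros_like(noisy)       # how often each pixel was hidden
    for _ in range(steps):
        keep = (rng.random(noisy.shape) >= mask_ratio).astype(noisy.dtype)
        hidden = 1.0 - keep
        feats = neighbourhood(noisy * keep)         # masked input features
        pred = np.tensordot(theta, feats, axes=1)   # filter output, (H, W)

        # Gradient of the mean squared error on hidden pixels w.r.t. theta.
        resid = (pred - noisy) * hidden
        grad = 2.0 * (feats * resid).sum(axis=(1, 2)) / max(hidden.sum(), 1.0)
        theta -= lr * grad

        # Gather this step's predictions at the pixels that were hidden.
        pred_sum += pred * hidden
        pred_cnt += hidden
    filled = pred_sum / np.maximum(pred_cnt, 1.0)
    # Pixels never hidden (very unlikely) fall back to the noisy value.
    return np.where(pred_cnt > 0, filled, noisy), theta

# Demo on a smooth periodic image with Gaussian noise (sigma = 25 / 255).
rng = np.random.default_rng(1)
rows = np.sin(2 * np.pi * np.arange(64) / 64)
cols = np.cos(2 * np.pi * np.arange(64) / 64)
clean = 0.5 + 0.25 * np.outer(rows, cols)
noisy = clean + rng.normal(0.0, 25 / 255, clean.shape)

theta0 = np.full(9, 1 / 8)   # stand-in for pre-trained weights
theta0[4] = 0.0              # no weight on the centre pixel
denoised, theta_fit = iterative_fill(noisy, theta0)

mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_denoised = float(np.mean((denoised - clean) ** 2))
```

Because a hidden pixel is zeroed in the input, the filter can only predict it from its visible neighbours, so it cannot simply copy the noise at that location. Averaging predictions over many different masks is what gradually assembles a full image in this toy version; a real implementation would replace the linear filter with the pre-trained network and an optimizer step.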