Enhancing convolutional neural network generalizability via low-rank weight approximation

Chenyin Gao, Shu Yang, Anru R. Zhang

TL;DR

The paper addresses image denoising under limited data by proposing a self-supervised framework that embeds Tucker low-rank weight approximations into a CNN within an ADMM-based optimization. By representing weight kernels as $W = \mathcal{G} \times_3 {U}^{(3)}_{r_3} \times_4 {U}^{(4)}_{r_4}$ and applying weight distortion at twist steps, the method achieves a flatter loss landscape and enhanced out-of-sample generalization. Rank selection is driven by VBMF, enabling data-driven determination of $r_3$ and $r_4$ without manual intervention. Across real-world cryo-EM and synthetic noise datasets, the approach yields competitive or superior PSNR/SSIM compared with both non-learning baselines and supervised methods, while reducing data acquisition costs and computational requirements.
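The factorization $W = \mathcal{G} \times_3 {U}^{(3)}_{r_3} \times_4 {U}^{(4)}_{r_4}$ can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a kernel stored as (height, width, input channels, output channels), computes the factors by truncated HOSVD (a standard way to obtain a Tucker approximation, swapped in here for illustration), and fixes the ranks by hand rather than selecting them with VBMF. The ADMM training loop is omitted, and the helper names `unfold`, `tucker2`, and `reconstruct` are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker2(W, r3, r4):
    """Truncated HOSVD along modes 3 and 4 (axes 2 and 3) of a 4-way kernel."""
    # Leading left singular vectors of the mode-3 and mode-4 unfoldings.
    U3 = np.linalg.svd(unfold(W, 2), full_matrices=False)[0][:, :r3]
    U4 = np.linalg.svd(unfold(W, 3), full_matrices=False)[0][:, :r4]
    # Core tensor: G = W x_3 U3^T x_4 U4^T.
    G = np.einsum('abcd,ci,dj->abij', W, U3, U4)
    return G, U3, U4

def reconstruct(G, U3, U4):
    """Rebuild the kernel: W_hat = G x_3 U3 x_4 U4."""
    return np.einsum('abij,ci,dj->abcd', G, U3, U4)

# Assumed toy setup: a 3x3 kernel with 16 input and 32 output channels
# whose channel modes have exact ranks (4, 5).
rng = np.random.default_rng(0)
W = reconstruct(rng.standard_normal((3, 3, 4, 5)),
                rng.standard_normal((16, 4)),
                rng.standard_normal((32, 5)))

G, U3, U4 = tucker2(W, r3=4, r4=5)
W_hat = reconstruct(G, U3, U4)

n_full = W.size                        # parameters in the dense kernel
n_low = G.size + U3.size + U4.size     # parameters in the Tucker-2 form
rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
```

In this toy setup the dense kernel has 3·3·16·32 = 4608 parameters, while the core and the two factor matrices together have 180 + 64 + 160 = 404. The reconstruction is exact up to floating-point error because the kernel was built with those ranks; for a trained kernel, truncation would generally introduce some approximation error.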

Abstract

Noise is ubiquitous during image acquisition. Sufficient denoising is often an important first step for image processing. In recent decades, deep neural networks (DNNs) have been widely used for image denoising. Most DNN-based image denoising methods require a large-scale dataset or focus on supervised settings, in which single clean images, pairs of clean images, or a set of noisy images are required. This poses a significant burden on the image acquisition process. Moreover, denoisers trained on datasets of limited scale may incur over-fitting. To mitigate these issues, we introduce a new self-supervised framework for image denoising based on the Tucker low-rank tensor approximation. With the proposed design, we are able to characterize our denoiser with fewer parameters and train it based on a single image, which considerably improves the model's generalizability and reduces the cost of data acquisition. Extensive experiments on both synthetic and real-world noisy images have been conducted. Empirical results show that our proposed method outperforms existing non-learning-based methods (e.g., low-pass filter, non-local means) and single-image unsupervised denoisers (e.g., DIP, NN+BM3D), evaluated on both in-sample and out-sample datasets. The proposed method even achieves performance comparable to some supervised methods (e.g., DnCNN).

Paper Structure (22 sections, 10 equations, 16 figures, 4 tables)


Figures (16)

  • Figure 1: A schematic illustration of weight distortion in the stochastic gradient descent training process
  • Figure 2: Illustration of the U-net deep learning framework. The left side (before the Bottleneck) is the encoder and the right side (after the Bottleneck) is the decoder.
  • Figure 3: Visualized in-sample (top) and out-sample (bottom) performance comparison on two raw micrographs of SARS-CoV-2 2P protein microscopy. The hyper-parameters for the ADMM framework are chosen as $\eta = 0.5, \rho = 100$. Recall that BM3D is a non-learning-based method and therefore does not provide an out-sample denoiser.
  • Figure 4: Visual comparisons of our method against other competing methods in terms of in-sample performance from dataset $\textit{SET12}$ coupled with PSNR and SSIM. See more in-sample comparisons in the supplementary materials.
  • Figure 5: A visual comparison of (a) Poisson-Gaussian-noisy image, (b) denoised by BM3D, (c) denoised by NN+BM3D+T, and (d) denoised by NN+BM3D+VST+T ($\eta = 0.5, \rho = 100$ and $S_D = 200$).
  • ...and 11 more figures