Enhancing convolutional neural network generalizability via low-rank weight approximation
Chenyin Gao, Shu Yang, Anru R. Zhang
TL;DR
The paper addresses image denoising under limited data by proposing a self-supervised framework that embeds Tucker low-rank weight approximations into a CNN within an ADMM-based optimization scheme. By representing weight kernels as $W = \mathcal{G} \times_3 {U}^{(3)}_{r_3} \times_4 {U}^{(4)}_{r_4}$, where $\mathcal{G}$ is the core tensor and ${U}^{(3)}_{r_3}$, ${U}^{(4)}_{r_4}$ are the mode-3 and mode-4 factor matrices of ranks $r_3$ and $r_4$, and by applying weight distortion at twist steps, the method achieves a flatter loss landscape and better out-of-sample generalization. Rank selection is driven by variational Bayesian matrix factorization (VBMF), so $r_3$ and $r_4$ are determined from the data without manual tuning. Across real-world cryo-EM and synthetic noise datasets, the approach yields competitive or superior PSNR/SSIM compared with both non-learning baselines and supervised methods, while reducing data acquisition costs and computational requirements.
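To make the factorization $W = \mathcal{G} \times_3 {U}^{(3)}_{r_3} \times_4 {U}^{(4)}_{r_4}$ concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the kernel is stored as (height, width, input channels, output channels), so that modes 3 and 4 are the channel modes, and it obtains the factors with a truncated higher-order SVD instead of the ADMM procedure described in the paper. The ranks are fixed by hand here rather than chosen by VBMF, and the helper names `unfold`, `mode_product`, `tucker2`, and `reconstruct` are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode unfolding: rows index the chosen axis, columns index all other axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Mode product: contract axis `mode` of `tensor` with the columns of `matrix`."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))  # new axis lands in front
    return np.moveaxis(out, 0, mode)

def tucker2(weight, r3, r4):
    """Truncated HOSVD on the two channel modes (0-based axes 2 and 3).

    Assumption: this stands in for the paper's ADMM-based fitting.
    """
    u3 = np.linalg.svd(unfold(weight, 2), full_matrices=False)[0][:, :r3]
    u4 = np.linalg.svd(unfold(weight, 3), full_matrices=False)[0][:, :r4]
    core = mode_product(mode_product(weight, u3.T, 2), u4.T, 3)
    return core, u3, u4

def reconstruct(core, u3, u4):
    """Rebuild W = G x_3 U3 x_4 U4 from the core and the two factor matrices."""
    return mode_product(mode_product(core, u3, 2), u4, 3)

# Hypothetical sizes: a 3x3 kernel with 8 input and 16 output channels.
rng = np.random.default_rng(0)
true_core = rng.standard_normal((3, 3, 2, 3))
weight = reconstruct(true_core,
                     rng.standard_normal((8, 2)),
                     rng.standard_normal((16, 3)))

core, u3, u4 = tucker2(weight, r3=2, r4=3)
approx = reconstruct(core, u3, u4)

full_params = weight.size
factored_params = core.size + u3.size + u4.size
print(core.shape, u3.shape, u4.shape)
print(full_params, factored_params)
print(np.allclose(approx, weight))
```

Because the example kernel is built to have multilinear rank exactly (2, 3) in its channel modes, the truncated factorization reproduces it up to floating-point error while storing far fewer numbers. On a trained kernel the reconstruction would only be approximate, and the quality would depend on the chosen ranks.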
Abstract
Noise is ubiquitous during image acquisition. Sufficient denoising is often an important first step for image processing. In recent decades, deep neural networks (DNNs) have been widely used for image denoising. Most DNN-based image denoising methods require a large-scale dataset or focus on supervised settings, in which a single clean image, pairs of clean images, or a set of noisy images are required. This poses a significant burden on the image acquisition process. Moreover, denoisers trained on datasets of limited scale may over-fit. To mitigate these issues, we introduce a new self-supervised framework for image denoising based on the Tucker low-rank tensor approximation. With the proposed design, we are able to characterize our denoiser with fewer parameters and train it on a single image, which considerably improves the model's generalizability and reduces the cost of data acquisition. Extensive experiments on both synthetic and real-world noisy images have been conducted. Empirical results show that our proposed method outperforms existing non-learning-based methods (e.g., low-pass filter, non-local means) and single-image unsupervised denoisers (e.g., DIP, NN+BM3D) on both in-sample and out-of-sample datasets. The proposed method even achieves performance comparable to that of some supervised methods (e.g., DnCNN).
