Navigating Beyond Dropout: An Intriguing Solution Towards Generalizable Image Super Resolution

Hongjun Wang, Jiyuan Chen, Yinqiang Zheng, Tieyong Zeng

TL;DR

This work addresses the generalization gap in Blind SR under unknown degradations and critiques the use of Dropout as a regularizer due to its detrimental effect on high-frequency detail. It introduces Simple Alignments that regularize training by aligning the first- and second-order feature statistics, both in a linear form and via nonlinear Random Fourier Feature mappings to an RKHS, promoting degradation-invariant representations. The approach is model-agnostic and complements multi-degradation training, with extensive experiments showing consistent PSNR gains and reduced perceptual errors across seven benchmarks and multiple backbones, outperforming Dropout. Overall, the paper demonstrates that feature-statistics alignment is an effective regularization for Blind SR, enabling better restoration of fine details under unseen degradations and highlighting a practical path for improving generalization in image restoration tasks.

Abstract

Deep learning has led to a dramatic leap in Single Image Super-Resolution (SISR) performance in recent years. While most existing work assumes a simple, fixed degradation model (e.g., bicubic downsampling), research on Blind SR seeks to improve model generalization under unknown degradations. Recently, Kong et al. pioneered the investigation of a more suitable training strategy for Blind SR using Dropout. Although this method does bring substantial generalization improvements by mitigating overfitting, we argue that Dropout simultaneously introduces an undesirable side effect that compromises the model's capacity to faithfully reconstruct fine details. We present both theoretical and experimental analyses in this paper and, furthermore, propose another easy yet effective training strategy that enhances the generalization ability of the model by simply modulating its first- and second-order feature statistics. Experimental results show that our method can serve as a model-agnostic regularization and outperforms Dropout on seven benchmark datasets covering both synthetic and real-world scenarios.
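
To make the idea of aligning first- and second-order feature statistics concrete, the following is a minimal sketch, not the authors' implementation. It assumes two sets of feature vectors extracted from the same content under two different degradations, and penalizes the squared difference between their means and covariances. The function names (`alignment_loss`, `make_rff`), the unweighted sum of the two terms, and the Gaussian-kernel Random Fourier Feature map are all illustrative assumptions; the paper's exact loss, weighting, and choice of layers may differ.

```python
import math
import random


def mean_vec(feats):
    """First-order statistic: per-dimension mean of a list of feature vectors."""
    n, d = len(feats), len(feats[0])
    return [sum(f[k] for f in feats) / n for k in range(d)]


def cov_mat(feats):
    """Second-order statistic: (biased) covariance matrix of the feature vectors."""
    n, d = len(feats), len(feats[0])
    mu = mean_vec(feats)
    return [
        [sum((f[a] - mu[a]) * (f[b] - mu[b]) for f in feats) / n for b in range(d)]
        for a in range(d)
    ]


def alignment_loss(feats_a, feats_b):
    """Squared distance between means plus squared Frobenius distance
    between covariances (hypothetical, unweighted combination)."""
    mu_a, mu_b = mean_vec(feats_a), mean_vec(feats_b)
    cov_a, cov_b = cov_mat(feats_a), cov_mat(feats_b)
    first = sum((x - y) ** 2 for x, y in zip(mu_a, mu_b))
    second = sum(
        (x - y) ** 2 for row_a, row_b in zip(cov_a, cov_b) for x, y in zip(row_a, row_b)
    )
    return first + second


def make_rff(d_in, d_out, sigma=1.0, seed=0):
    """Random Fourier Feature map approximating a Gaussian kernel:
    phi(x) = sqrt(2 / d_out) * cos(W x + b), W ~ N(0, 1/sigma^2), b ~ U[0, 2*pi]."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(d_in)] for _ in range(d_out)]
    biases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(d_out)]
    scale = math.sqrt(2.0 / d_out)

    def phi(x):
        return [
            scale * math.cos(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)
        ]

    return phi


# Toy features from two "degradations" of the same content.
feats_a = [[0.0, 0.0], [2.0, 2.0]]
feats_b = [[1.0, 1.0], [3.0, 3.0]]
linear_loss = alignment_loss(feats_a, feats_b)  # means differ, covariances match

# Nonlinear variant: align the same statistics after the RFF mapping.
phi = make_rff(d_in=2, d_out=16)
rff_loss = alignment_loss([phi(f) for f in feats_a], [phi(f) for f in feats_b])
```

In a training loop, a term of this kind would presumably be added to the reconstruction loss so that features of differently degraded inputs are pulled toward the same statistics; how the term is weighted is left open here.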

Paper Structure (11 sections, 1 theorem, 4 equations, 12 figures, 4 tables)

This paper contains 11 sections, 1 theorem, 4 equations, 12 figures, 4 tables.

Key Result

Lemma 3.1

When dropout is applied at a rate $(1-p)$, the interaction $I^{(s)}_{dropout}(i,j)$ comprises only rewards from patterns with at most $r \sim B(s,p)$ units. Given this, Zhang et al. [zhang2020interpreting] proved that: $\frac{I_{\text{dropout }}^{(r)}(i, j)}{I^{(s)}(i, j)}=\frac{\sum\nolimits_{0\le q\le r}{\tbi
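
The quantity $r \sim B(s,p)$ in the lemma can be illustrated with a short sketch. It assumes that each of the $s$ contextual units is kept independently with probability $p$ (that is, dropped with probability $1-p$), so the number of surviving units follows a binomial distribution. The helper names below are hypothetical and only illustrate this counting argument, not the interaction measure itself.

```python
import math
import random


def survivor_pmf(s, p):
    """Probability that exactly r of s units survive dropout, for r = 0..s.
    Each unit is kept independently with probability p (dropout rate 1 - p)."""
    return [math.comb(s, r) * p**r * (1.0 - p) ** (s - r) for r in range(s + 1)]


def expected_survivors(pmf):
    """Mean number of surviving units under the given distribution."""
    return sum(r * prob for r, prob in enumerate(pmf))


def sample_survivors(s, p, rng):
    """Draw one realization of r by simulating the dropout mask directly."""
    return sum(1 for _ in range(s) if rng.random() < p)


pmf = survivor_pmf(s=10, p=0.7)
mean_r = expected_survivors(pmf)  # analytically s * p
one_draw = sample_survivors(10, 0.7, random.Random(0))
```

Since $r \le s$ always holds, patterns under dropout can never involve more units than patterns without it, which is the intuition behind the ratio stated in the lemma.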

Figures (12)

  • Figure 1: Given an HR image $I^{HR}$ and its LR version $I^{LR}$ in part (a), part (b) shows visual comparisons of the results restored by SRResNet regularized by Dropout [kong2022reflash] and by our method, respectively. Part (c) presents the estimated residuals. Our method gives better visual quality and preserves more vivid details.
  • Figure 1: Visual comparison with and without our approach in “bicubic+noise20+jpeg50”. (Zoom in for best view)
  • Figure 2: The MAPE of SRResNet in the frequency domain. Both channel and pixel dropout perform worse than the plain SRResNet in the middle-to-high frequency band, thus losing representation of fine details. In contrast, our method improves performance without this side effect.
  • Figure 2: Visual comparison with and without our approach in “blur2+bicubic+jpeg50”. (Zoom in for best view)
  • Figure 3: Comparison of channel diversity from a frequency perspective. A higher entropy indicates that the model covers a wider range of frequency bands. Since it is hard to evaluate a single pixel in the frequency domain, we consider only the channel dimension and investigate channel-wise dropout here.
  • ...and 7 more figures
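
The frequency-domain MAPE mentioned in the Figure 2 caption can be sketched as follows. This is a simplified illustration, assuming a 1D signal and a naive DFT rather than the 2D image spectra the paper presumably uses; the function names and the `eps` stabilizer are hypothetical. It computes, per frequency bin, the absolute percentage error between the magnitude spectra of a prediction and its target.

```python
import cmath


def dft_magnitudes(signal):
    """Magnitudes of the non-negative-frequency DFT bins (0 .. n // 2)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def spectral_mape(pred, target, eps=1e-8):
    """Per-bin absolute percentage error between magnitude spectra.
    eps avoids division by zero for bins where the target has no energy."""
    mag_pred, mag_target = dft_magnitudes(pred), dft_magnitudes(target)
    return [abs(a - b) / (b + eps) for a, b in zip(mag_pred, mag_target)]


# Target: constant (low-frequency) part plus an alternating (highest-frequency) part.
target = [1.0 + (-1.0) ** t for t in range(8)]
# Prediction that recovers only the low-frequency part, i.e. it lost fine detail.
pred = [1.0] * 8

errors = spectral_mape(pred, target)
# errors[0] is the DC bin, errors[-1] is the highest-frequency bin.
```

In this toy case the error is concentrated in the highest-frequency bin, which mirrors the kind of mid-to-high-frequency degradation the caption attributes to Dropout.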

Theorems & Definitions (1)

  • Lemma 3.1