Robust Network Learning via Inverse Scale Variational Sparsification

Zhiling Zhou, Zirui Liu, Chengming Xu, Yanwei Fu, Xinwei Sun

TL;DR

This work introduces an inverse scale variational sparsification framework within a time-continuous inverse scale space formulation that progressively learns finer-scale features by discerning variational differences between pixels, ultimately preserving only large-scale features in the smoothed image.

Abstract

While neural networks have made significant strides in many AI tasks, they remain vulnerable to a range of noise types, including natural corruptions, adversarial noise, and low-resolution artifacts. Many existing approaches focus on enhancing robustness against specific noise types, limiting their adaptability to others. Previous studies have addressed general robustness by adopting a spectral perspective, which tends to blur crucial features like texture and object contours. Our proposed solution, however, introduces an inverse scale variational sparsification framework within a time-continuous inverse scale space formulation. This framework progressively learns finer-scale features by discerning variational differences between pixels, ultimately preserving only large-scale features in the smoothed image. Unlike frequency-based methods, our approach not only removes noise by smoothing small-scale features where corruptions often occur but also retains high-contrast details such as textures and object contours. Moreover, our framework offers simplicity and efficiency in implementation. By integrating this algorithm into neural network training, we guide the model to prioritize learning large-scale features. We show the efficacy of our approach through enhanced robustness against various noise types.

Paper Structure (21 sections, 1 theorem, 12 equations, 7 figures, 8 tables, 1 algorithm)


Key Result

Proposition 1

Given $u_k$ and $S_k:=\mathrm{supp}(\gamma_k)$, if $G=(V,E_{S_k^c})$ has $C$ connected components $G_1=(V_1,E_1),\ldots,G_C=(V_C,E_C)$ such that $V=V_1 \cup \cdots \cup V_C$, then $\widetilde{u}_k$ in the projection equation (eq.projection) can be determined as follows, with a complexity of $\mathcal{O}(p)$:
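
As a hedged illustration of how a component-wise computation of this kind can run in roughly linear time, the Python sketch below assumes (this is an assumption, not the paper's stated formula) that the projection forces $u$ to be constant across every edge in $E_{S_k^c}$. Under that assumption, the Euclidean projection replaces each entry of $u_k$ with the mean over its connected component. The function name `project_by_components` and the edge-list input format are hypothetical.

```python
import numpy as np

def project_by_components(u, edges_off_support):
    """Average u over each connected component of the graph (V, E_{S^c}).

    u: 1-D array of p pixel values (hypothetical flattened image).
    edges_off_support: iterable of (i, j) index pairs, i.e. the edges
        whose variational difference is NOT in the support S.
    """
    u = np.asarray(u, dtype=float)
    p = u.shape[0]
    parent = list(range(p))  # union-find forest over the p vertices

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the two endpoints of every off-support edge.
    for a, b in edges_off_support:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Label each vertex by its component root, then average per label.
    roots = np.array([find(i) for i in range(p)])
    sums = np.bincount(roots, weights=u, minlength=p)
    counts = np.bincount(roots, minlength=p)
    return sums[roots] / counts[roots]
```

On a pixel grid the number of edges is proportional to $p$, so the union-find pass and the two `np.bincount` calls together stay near-linear in $p$. A breadth-first search over the same edges would be an equally valid way to label the components.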

Figures (7)

  • Figure 1: (a) Visualization of Inverse Scale Space. As $t$ grows, our method progressively learns finer scale information, until fully recovering the original image. (b) Illustration of the difference between low-frequency components and large-scale features. The first row shows the image, and the second shows the visualization via Grad-CAM (Selvaraju et al., 2017). Unlike low-frequency images, large-scale images smooth out fine-grained details without blurring important features such as texture and object contours, effectively removing redundant information.
  • Figure 2: (a) An example from CIFAR-10 with a cut-off radius $r=6$ (above: low-frequency component; below: high-frequency component). (b) The feature map of the first convolution layer of ResNet (above: Iterative; below: Vanilla) at Epoch 9 (left), Epoch 59 (middle), and Epoch 159 (right). (c) The expected difference in the frequency domain on CIFAR-10: the top (resp. bottom) row shows the difference between the original image and the one with sparsity 0.6 (resp. 0.8).
  • Figure 3: The LRP result of two images from ImageNet100. The first, second, and third rows show results for original images, Gaussian blurred images (containing only low-frequency components), and variational sparse images (containing only large-scale information), respectively. Brighter colors typically indicate higher importance.
  • Figure 4: Illustration of our training procedure and the Graph Algorithm used in acceleration.
  • Figure 5: Visualization of learned features in four images: cat (top-left), boxer (bottom-left), frog (top-right), and cabinet (bottom-right) during iterative training. The top two are from CIFAR-10 and the bottom two are from miniImageNet. In each image, the top and bottom rows correspond to the vanilla model and our method in the iterative training equation (eq:iterative), respectively.
  • ...and 2 more figures
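
The low/high-frequency split with a cut-off radius $r$ mentioned in Figure 2(a) can be sketched as follows. This is a minimal illustration that assumes a circular mask of radius $r$ applied to the centered 2-D Fourier spectrum of a single-channel image; the paper's exact masking convention is not given here, and the function name `split_frequency` is hypothetical.

```python
import numpy as np

def split_frequency(img, r):
    """Split a 2-D image into low- and high-frequency components.

    img: 2-D real array (one channel).
    r: cut-off radius in frequency bins (assumed circular mask).
    """
    h, w = img.shape
    # Move the zero-frequency term to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(img))

    # Boolean disk of radius r around the spectrum center.
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= r ** 2

    # Keep only frequencies inside the disk, then invert the transform.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
    high = img - low  # the residual carries the remaining frequencies
    return low, high
```

By construction `low + high` reproduces the input, so the two panels of such a figure are complementary views of the same image. For a 32×32 CIFAR-10 channel, `split_frequency(channel, 6)` would correspond to the $r=6$ setting under the stated assumption.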

Theorems & Definitions (6)

  • Remark 1
  • Remark 2
  • Remark 3
  • Proposition 1
  • Proof of Proposition 1
  • Remark 4