Spectral Roll-off Points Variations: Exploring Useful Information in Feature Maps by Its Variations

Yunkai Yu, Yuyang You, Zhihong Yang, Guozheng Liu, Peiyao Li, Zhicheng Yang, Wenjing Shan

TL;DR

Inspired by the low-Nyquist-frequency nature of useful information (UI), spectral roll-off points (SROPs) are proposed to estimate UI through its variations, which promotes the explainability of data representations with respect to frequency-domain knowledge.

Abstract

Useful information (UI) is an elusive concept in neural networks. A quantitative measurement of UI is absent, even though variations of UI can be recognized from prior knowledge. The communication bandwidth of feature maps decreases after downscaling operations, but UI flows smoothly after training because of its lower Nyquist frequency. Inspired by the low-Nyquist-frequency nature of UI, we propose the use of spectral roll-off points (SROPs) to estimate UI through its variations. The computation of an SROP is extended from a 1-D signal to a 2-D image by way of the rotation invariance required in image classification tasks. SROP statistics across feature maps serve as layer-wise UI estimates. We design sanity checks to explore SROP variations when UI variations are produced by variations in model input, model architecture, and training stages. SROP variations synchronize with UI variations in various randomized and sufficiently trained model structures. Therefore, SROP variations are an accurate and convenient sign of UI variations, which promotes the explainability of data representations with respect to frequency-domain knowledge.
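To make the 1-D to 2-D extension concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the common signal-processing definition of a roll-off point (the frequency below which a fixed fraction of spectral energy lies), an illustrative fraction of 0.85, and rotation invariance obtained by ranking 2-D frequency bins by their radial distance from the origin. The function names, the 0.85 fraction, and the use of mean and standard deviation as the layer-wise statistics are all assumptions made for illustration.

```python
import numpy as np


def spectral_rolloff_1d(signal, fraction=0.85):
    """Frequency (cycles/sample) below which `fraction` of the energy lies.

    `fraction=0.85` is an assumed, illustrative threshold.
    """
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal))
    cumulative = np.cumsum(power)
    idx = np.searchsorted(cumulative, fraction * cumulative[-1])
    return freqs[min(idx, len(freqs) - 1)]


def spectral_rolloff_2d(image, fraction=0.85):
    """Radial roll-off frequency of a 2-D array.

    Bins are ordered by radial frequency sqrt(fx^2 + fy^2), so the result
    does not depend on the orientation of the spectral content (assumed
    way of obtaining rotation invariance).
    """
    h, w = image.shape
    power = np.abs(np.fft.fft2(image)) ** 2
    fy = np.fft.fftfreq(h)
    fx = np.fft.fftfreq(w)
    radius = np.hypot(fy[:, None], fx[None, :])

    # Accumulate energy from the lowest to the highest radial frequency.
    order = np.argsort(radius, axis=None)
    sorted_radius = radius.ravel()[order]
    cumulative = np.cumsum(power.ravel()[order])
    idx = np.searchsorted(cumulative, fraction * cumulative[-1])
    return sorted_radius[min(idx, sorted_radius.size - 1)]


def layerwise_srop(feature_maps, fraction=0.85):
    """Mean and standard deviation of the roll-off over (C, H, W) maps."""
    srops = np.array([spectral_rolloff_2d(fm, fraction) for fm in feature_maps])
    return srops.mean(), srops.std()


if __name__ == "__main__":
    x = np.arange(64)
    smooth = np.tile(np.sin(2 * np.pi * 2 * x / 64), (64, 1))  # low-frequency pattern
    noise = np.random.default_rng(0).standard_normal((64, 64))  # broadband noise
    print("smooth:", spectral_rolloff_2d(smooth))
    print("noise: ", spectral_rolloff_2d(noise))
```

In this sketch a smooth pattern yields a much lower roll-off than white noise, which matches the low-Nyquist-frequency intuition described in the abstract. The DC component is included in the energy sum; subtracting the mean first is an alternative design choice.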

Paper Structure

This paper contains 32 sections, 7 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: A demo of LFP. Images contain useful information and are low-Nyquist-frequency. Noise, in which useful information is mixed with plenty of redundant information, is high-Nyquist-frequency. The SROP is used as the Nyquist frequency of useful information. The SROP threshold reflects the highest Nyquist frequency of transmissible information. Information whose Nyquist frequency is higher than the threshold cannot pass to the subsequent layers perfectly. The demo shows that heterogeneity of samples is a confounding factor in obtaining accurate useful information, but the variations of useful information are trackable. According to LFP, a decrease in layer-wise SROP indicates an increase in useful information.
  • Figure 2: Frog/digit images and SROP distributions. $w$ indicates the proportion of the frog image in the synthesized images. When performing the digit classification task, the frog image is new useful information in CASE I and noise in CASE II. Red curves are kernel density plots of the mean SROPs of the convolution when the MNIST test set is used as model input. The models are well trained. Vertical lines are the mean SROPs when the frog image is used as model input. Details of CASE I and II and of the convolution can be found in Sec. \ref{subsec:ui_var_input}.
  • Figure 3: Numerical values of accuracy and mean SROP of frog patterns. As introduced in Sec. \ref{subsec:ui_var_input}, the frog patterns are useful information in CASE I and noise in CASE II. The mean SROP in feature maps stays low when the model extracts sufficient useful information for high accuracy (CASE I). When the model fails to yield ideal accuracy (CASE II), this is reflected in high SROPs.
  • Figure 4: Normalized SROPs in pooling layers and randomized strided convolutions. The kernel sizes of the pooling layers in AlexNet and VGG are three and two, respectively. Triangles denote the SROP mean values. Dashed lines denote the SROP mean values from the benchmark (Sec. \ref{secsec:max_baseline}). Box plots show median values and quartiles.
  • Figure 5: SROP mean values in randomized and pre-trained backbones. The y-axis denotes $\log(\mathrm{SROP})$. Anti-aliased models are pre-trained. The benchmark Pool is a max-pooling layer whose kernel size and stride are both two. SROP curves with the ds prefix are computed by replacing intermediate and downscaling layers with the benchmark Pool.