Covariance-corrected Whitening Alleviates Network Degeneration on Imbalanced Classification

Zhiwei Zhang

TL;DR

Two covariance-corrected modules are proposed, the Group-based Relatively Balanced Batch Sampler (GRBS) and the Batch Embedded Training (BET), to get more accurate and stable batch covariance, thereby reinforcing the capability of whitening.

Abstract

Class imbalance is a critical issue in image classification that significantly affects the performance of deep recognition models. In this work, we first identify a network degeneration dilemma that hinders model learning by introducing a high linear dependence among the features fed into the classifier. To overcome this challenge, we propose a novel framework called Whitening-Net to mitigate the degenerate solutions, in which ZCA whitening is integrated before the linear classifier to normalize and decorrelate the batch samples. However, in scenarios with extreme class imbalance, the batch covariance statistic fluctuates significantly, impeding the convergence of the whitening operation. Therefore, we propose two covariance-corrected modules, the Group-based Relatively Balanced Batch Sampler (GRBS) and the Batch Embedded Training (BET), to obtain a more accurate and stable batch covariance, thereby reinforcing the capability of whitening. Our modules can be trained end-to-end without incurring substantial computational costs. Comprehensive empirical evaluations on benchmark datasets, including CIFAR-LT-10/100, ImageNet-LT, and iNaturalist-LT, validate the effectiveness of the proposed approaches.
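
To make the whitening step concrete, the following is a minimal NumPy sketch of ZCA whitening applied to one batch of features, assuming a plain eigendecomposition of the batch covariance. The function name `zca_whiten`, the `eps` regularizer, and the toy data are illustrative assumptions, not the Whitening-Net implementation, which this page does not describe in detail.

```python
import numpy as np

def zca_whiten(features, eps=1e-5):
    """ZCA-whiten a batch of features with shape (batch_size, dim).

    Hypothetical helper for illustration; eps is an assumed regularizer
    that keeps the inverse square root stable for tiny eigenvalues.
    """
    # Center each channel over the batch.
    centered = features - features.mean(axis=0, keepdims=True)
    # Batch covariance (dim x dim), regularized on the diagonal.
    cov = centered.T @ centered / features.shape[0]
    cov = cov + eps * np.eye(cov.shape[0])
    # ZCA transform: U diag(lambda^-1/2) U^T, a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    whitening = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return centered @ whitening

# Toy batch: 8 channels that all share a common component,
# so the channels are strongly correlated before whitening.
rng = np.random.default_rng(0)
mixing = np.eye(8) + 0.5 * np.ones((8, 8))
x = rng.normal(size=(512, 8)) @ mixing

z = zca_whiten(x)
cov_z = z.T @ z / z.shape[0]  # close to the identity after whitening
```

Because the transform is computed from the covariance of the current batch, any fluctuation in that covariance carries over directly into the whitened features, which is the instability the abstract attributes to extreme class imbalance.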

Paper Structure

This paper contains 22 sections, 3 equations, 9 figures, 10 tables, and 1 algorithm.

Figures (9)

  • Figure 1: The proposed end-to-end training framework for imbalanced classification. The system applies ZCA whitening to the features before they are fed into the classifier, and includes two covariance-corrected modules, Group-based Relatively Balanced Batch Sampler (GRBS) and Batch Embedded Training (BET).
  • Figure 2: Visualizations of the feature distribution before the features are fed into the classifier. The top row shows the correlation coefficients between channel-wise features. The bottom row shows the singular value histograms of the features. The X-axis represents the singular value, the Y-axis represents the epoch, and the Z-axis is the frequency. The left, middle and right columns present the results obtained by training the neural networks on balanced CIFAR-10, imbalanced CIFAR-10 without whitening, and imbalanced CIFAR-10 with whitening, respectively. The main difference between the balanced and imbalanced tasks is that features learned on the imbalanced dataset are more correlated than those learned on the balanced dataset, as shown by higher correlation coefficients and more singular values near zero.
  • Figure 3: Visualization of the batch covariance of the last-layer hidden features before they are fed into the classifier. The experiments are conducted on the CIFAR-100-LT dataset using ResNet-32. In the figure, "BET" denotes the proposed GRBS and BET approaches.
  • Figure 5: Singular value histograms of features on different layers of ResNet-32 using end-to-end training (the sub-figures in (a) and (b), from left to right, are Layer_1, Layer_2, Layer_3 and Layer_p, where "p" denotes pooling; the last sub-figure in (c) is Layer_p after the whitening transformation). The top, middle and bottom rows present the results on balanced CIFAR-10, imbalanced CIFAR-10, and imbalanced CIFAR-10 with whitening, respectively. The vertical axis in each figure is the training epoch. The main difference between the balanced and imbalanced tasks is that a large number of the singular values of the features fed into the classifier (i.e., the last column) are nearly zero in the imbalanced task, which implies that these features are highly correlated. The bottom row shows that whitening effectively decorrelates these features, since more of their singular values are large.
  • Figure 6: The correlation coefficients between channel-wise features fed into the classifier at the last epochs. The experiments are conducted on the CIFAR-100-LT dataset using ResNet-32.
  • ...and 4 more figures
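
The two diagnostics named in the captions of Figures 2 and 5, channel-wise correlation coefficients and singular values of the features, can be computed as in the minimal NumPy sketch below. The helper `degeneration_stats`, the relative threshold `tol`, and the synthetic feature matrices are assumptions made for illustration; they are not the paper's evaluation code or data.

```python
import numpy as np

def degeneration_stats(features, tol=1e-3):
    """Summarize linear dependence in features of shape (num_samples, dim).

    Hypothetical helper: returns the mean absolute off-diagonal correlation
    and the number of singular values below tol times the largest one.
    """
    # Correlation coefficients between channels (columns).
    corr = np.corrcoef(features, rowvar=False)
    off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
    # Singular values of the centered feature matrix.
    centered = features - features.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    near_zero = int(np.sum(singular_values < tol * singular_values.max()))
    return float(np.abs(off_diag).mean()), near_zero

rng = np.random.default_rng(0)

# Synthetic stand-in for well-spread features: 16 independent channels.
independent = rng.normal(size=(256, 16))

# Synthetic stand-in for degenerate features: 16 channels that are
# linear combinations of only 2 latent factors.
latent = rng.normal(size=(256, 2))
degenerate = latent @ rng.normal(size=(2, 16))

stats_independent = degeneration_stats(independent)
stats_degenerate = degeneration_stats(degenerate)
```

In this toy setting the degenerate features show both symptoms the captions describe: larger correlation coefficients between channels and many singular values that are close to zero.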