Unsupervised Adaptive Normalization

Bilal Faye; Hanane Azzag; Mustapha Lebbah; Fangchen Fang

Unsupervised Adaptive Normalization

Bilal Faye, Hanane Azzag, Mustapha Lebbah, Fangchen Fang

TL;DR

Unsupervised Adaptive Normalization (UAN) is introduced, an innovative algorithm that seamlessly integrates clustering for normalization with deep neural network learning in a singular process and enhances gradient stability, resulting in faster learning and augmented neural network performance.

Abstract

Deep neural networks have become a staple in solving intricate problems, proving their mettle in a wide array of applications. However, their training process is often hampered by shifting activation distributions during backpropagation, resulting in unstable gradients. Batch Normalization (BN) addresses this issue by normalizing activations, which allows for the use of higher learning rates. Despite its benefits, BN is not without drawbacks, including its dependence on mini-batch size and the presumption of a uniform distribution of samples. To overcome this, several alternatives have been proposed, such as Layer Normalization, Group Normalization, and Mixture Normalization. These methods may still struggle to adapt to the dynamic distributions of neuron activations during the learning process. To bridge this gap, we introduce Unsupervised Adaptive Normalization (UAN), an innovative algorithm that seamlessly integrates clustering for normalization with deep neural network learning in a singular process. UAN executes clustering using the Gaussian mixture model, determining parameters for each identified cluster, by normalizing neuron activations. These parameters are concurrently updated as weights in the deep neural network, aligning with the specific requirements of the target task during backpropagation. This unified approach of clustering and normalization, underpinned by neuron activation normalization, fosters an adaptive data representation that is specifically tailored to the target task. This adaptive feature of UAN enhances gradient stability, resulting in faster learning and augmented neural network performance. UAN outperforms the classical methods by adapting to the target task and is effective in classification, and domain adaptation.

Unsupervised Adaptive Normalization

TL;DR

Abstract

Paper Structure (11 sections, 11 equations, 5 figures, 2 tables, 1 algorithm)

This paper contains 11 sections, 11 equations, 5 figures, 2 tables, 1 algorithm.

Introduction
State of the art
Contribution: Unsupervised Adaptive Normalization
Experiments
Datasets
Unsupervised Adaptive Normalization on Classification task
Shallow Convolutional Neural Network
Deep Convolutional Neural Network
Unsupervised Adaptive Normalization on Domain Adaptation
Computational Complexity
Conclusion

Figures (5)

Figure 1: Unsupervised Adaptive Normalization (UAN) applied on neuron activation $x_i$. The parameter $K$ corresponds to the number of clusters estimated during the training process. In the forward pass, the $K$ cluster parameters $\{\lambda_k, \mu_k, \sigma_k^2\}$ are used to normalize the activation $x_i$ through Equation \ref{['mn_aggregation']}. The resulting normalized activation serves as input for the Deep Neural Network (DNN). After loss computation, in the backward pass, the parameters $\{\lambda_k, \mu_k, \sigma_k: k = 1, ..., T\}$ undergo updates based on the target task.
Figure 2: t-SNE visualization of activations in latent space after 25, 50, 70 and 100 training epochs. Unsupervised Adaptive Normalization (UAN) on CIFAR-10, showing the formation and refinement of class-specific clusters over training epochs.
Figure 5: Evolution of the gradient variance during the training of AdaMatch and AdaMatch+UAN models on the source (MNIST) and target (SVHN) domains. The figures on the left correspond to the maximum gradient variance for each epoch, while the figures on the right correspond to the average gradient variance per epoch.
Figure : Learning rate = 0.001
Figure : DenseNet-40

Unsupervised Adaptive Normalization

TL;DR

Abstract

Unsupervised Adaptive Normalization

Authors

TL;DR

Abstract

Table of Contents

Figures (5)