Adaptative Context Normalization: A Boost for Deep Learning in Image Processing

Bilal Faye, Hanane Azzag, Mustapha Lebbah, Djamel Bouchaffra

TL;DR

Adaptative Context Normalization (ACN) tackles distribution shifts across layers in image-processing networks by introducing context-based normalization, in which per-context parameters $\{\mu_r, \sigma_r\}$ are learned during training. By treating activations as arising from predefined contexts, ACN enables context-specific normalization and optional mixture-aggregation at inference, offering faster convergence and better generalization than Batch Normalization and Mixture Normalization. The approach demonstrates strong gains across CNNs and Vision Transformers, including improvements on CIFAR-100 when superclasses are used as contexts and substantial domain-adaptation benefits when paired with AdaMatch. The work also outlines a path toward an unsupervised variant that discovers contexts online, aimed at further boosting convergence and robustness in diverse image-processing tasks.

Abstract

Deep neural network learning for image processing faces major challenges related to changes in distribution across layers, which disrupt model convergence and performance. Activation normalization methods, such as Batch Normalization (BN), have revolutionized this field, but they rely on the simplifying assumption that the data distribution can be modelled by a single Gaussian distribution. To overcome this limitation, Mixture Normalization (MN) introduced an approach based on a Gaussian Mixture Model (GMM), assuming multiple components to model the data. However, this method entails substantial computational cost, because it uses the Expectation-Maximization (EM) algorithm to estimate the parameters of each Gaussian component. To address this issue, we introduce Adaptative Context Normalization (ACN), a novel supervised approach built on the concept of a "context", which groups together a set of data with similar characteristics. Data belonging to the same context are normalized using the same parameters, enabling local representation based on contexts. For each context, the normalization parameters, like the model weights, are learned during the backpropagation phase. ACN not only converges faster and performs better than BN and MN but also presents a fresh perspective that underscores its particular efficacy in the field of image processing.
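To make the idea of context-based normalization concrete, here is a minimal sketch of the forward pass only. It assumes the standard normalization form $(x - \mu_r) / \sqrt{\sigma_r^2 + \epsilon}$ applied with the parameters of each sample's context $r$. The function name `context_normalize`, the `eps` constant, the scalar activations, and the parameter values are all hypothetical choices made for illustration. In ACN the per-context parameters are learned by backpropagation; here they are simply fixed by hand.

```python
import math


def context_normalize(activations, contexts, mu, sigma, eps=1e-5):
    """Normalize each activation with the parameters of its own context.

    activations: list of floats (one scalar per sample, for brevity)
    contexts:    list of context ids, same length as activations
    mu, sigma:   dicts mapping context id -> mean / standard deviation
    eps:         small constant for numerical stability (an assumption)
    """
    normalized = []
    for x, r in zip(activations, contexts):
        # Samples that share a context r share the same (mu_r, sigma_r).
        normalized.append((x - mu[r]) / math.sqrt(sigma[r] ** 2 + eps))
    return normalized


# Hypothetical parameters for two contexts; in ACN these would be trainable.
mu = {0: 0.0, 1: 10.0}
sigma = {0: 1.0, 1: 2.0}

x = [1.0, -1.0, 12.0, 8.0]
ctx = [0, 0, 1, 1]

z = context_normalize(x, ctx, mu, sigma)
print(z)  # each value lands close to +1 or -1 despite the different scales
```

Although the two contexts live on very different scales, each sample is mapped to a comparable range because it is normalized with its own context's parameters rather than with a single batch-wide mean and variance.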

Paper Structure (12 sections, 11 equations, 3 figures, 2 tables, 1 algorithm)

Figures (3)

  • Figure 2: Training and validation error curves for ACN-base and ACN in a ViT architecture on CIFAR-100. Both variants converge faster and reach lower validation loss than BN, improving learning efficiency and classification.
  • Figure 3: Gradient Variance Evolution: AdaMatch and AdaMatch+ACN models during training on source (MNIST) and target (SVHN) domains. Left: Max gradient variance per epoch. Right: Average gradient variance per epoch.
  • Figure : Learning rate = 0.001