Navigating Noise: A Study of How Noise Influences Generalisation and Calibration of Neural Networks

Martin Ferianc, Ondrej Bohdal, Timothy Hospedales, Miguel Rodrigues

TL;DR

This work tackles the problem of improving neural network generalisation and confidence calibration under distribution shift by systematically evaluating a wide spectrum of noise injections. It introduces a unified methodology that treats multiple noise sources with controllable probabilities $p_{noise}^{i}$ and magnitudes $\delta^{i}$ via placement-specific functions $\alpha^i_{\textrm{<place>}}$, allowing simultaneous, conditional application during training across $E$ epochs. Across CV, tabular, and NLP tasks, the study reveals that domain-specific noises (notably AugMix in CV) and damping/regularising noises (e.g., Dropout) commonly improve both ID and OOD performance, though transferability of hyperparameters across datasets and architectures is nuanced. Combining noises often yields better calibration and generalisation for classification, but careful tuning and budget considerations are essential due to potential negative interactions. The analysis of learnt representation landscapes shows that noise can smooth optimization with respect to prediction error while calibration landscapes remain more resistant, underscoring the need for domain-aware noise design and targeted hyperparameter transfer. Overall, the paper provides practical guidance and a robust framework for evaluating and combining noise injections to enhance generalisation and calibration in real-world tasks.
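The idea of several noise sources, each with its own probability $p_{noise}^{i}$, magnitude $\delta^{i}$ and placement-specific function, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `NoiseSource`, the helpers `gaussian_input_noise`, `input_dropout`, `apply_noises` and `train`, and the choice to sample each noise independently per batch are all assumptions made for illustration, and both example noises act on the input placement only.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


@dataclass
class NoiseSource:
    """Hypothetical container for one noise source i."""
    name: str
    p_noise: float  # probability of applying this noise to a batch
    delta: float    # magnitude handed to the placement-specific function
    alpha: Callable[[Vector, float, random.Random], Vector]  # stand-in for alpha^i_<place>


def gaussian_input_noise(x: Vector, delta: float, rng: random.Random) -> Vector:
    """Additive Gaussian noise on the inputs with standard deviation delta."""
    return [v + rng.gauss(0.0, delta) for v in x]


def input_dropout(x: Vector, delta: float, rng: random.Random) -> Vector:
    """Zero each input feature with probability delta (no rescaling, for brevity)."""
    return [0.0 if rng.random() < delta else v for v in x]


def apply_noises(
    x: Vector, sources: Sequence[NoiseSource], rng: random.Random
) -> Tuple[Vector, List[str]]:
    """Apply each source independently; several may fire on the same batch."""
    applied = []
    for src in sources:
        if rng.random() < src.p_noise:  # conditional application
            x = src.alpha(x, src.delta, rng)
            applied.append(src.name)
    return x, applied


def train(batches: Sequence[Vector], sources: Sequence[NoiseSource],
          epochs: int, seed: int = 0) -> List[Tuple[int, List[str]]]:
    """Skeleton loop over E epochs that only records which noises fired."""
    rng = random.Random(seed)
    log = []
    for epoch in range(epochs):
        for x in batches:
            noisy_x, applied = apply_noises(list(x), sources, rng)
            # A real loop would run the forward and backward pass on noisy_x here.
            log.append((epoch, applied))
    return log


if __name__ == "__main__":
    sources = [
        NoiseSource("gaussian", p_noise=0.5, delta=0.1, alpha=gaussian_input_noise),
        NoiseSource("dropout", p_noise=0.3, delta=0.2, alpha=input_dropout),
    ]
    for epoch, applied in train([[1.0, 2.0, 3.0]] * 4, sources, epochs=2):
        print(epoch, applied)
```

Keeping the probability and magnitude as separate fields per source mirrors the description above: the probability decides whether a noise fires at all, while the magnitude only matters once it does, so the two can be tuned independently.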

Abstract

Enhancing the generalisation abilities of neural networks (NNs) through integrating noise such as MixUp or Dropout during training has emerged as a powerful and adaptable technique. Despite the proven efficacy of noise in NN training, there is no consensus regarding which noise sources, types and placements yield maximal benefits in generalisation and confidence calibration. This study thoroughly explores diverse noise modalities to evaluate their impacts on NNs' generalisation and calibration under in-distribution and out-of-distribution settings, paired with experiments investigating the metric landscapes of the learnt representations across a spectrum of NN architectures, tasks, and datasets. Our study shows that AugMix and weak augmentation exhibit cross-task effectiveness in computer vision, emphasising the need to tailor noise to specific domains. Our findings emphasise the efficacy of combining noises and successful hyperparameter transfer within a single domain but the difficulties in transferring the benefits to other domains. Furthermore, the study underscores the complexity of simultaneously optimising for both generalisation and calibration, emphasising the need for practitioners to carefully consider noise combinations and hyperparameter tuning for optimal performance in specific tasks and datasets.
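To make "confidence calibration" concrete, the sketch below computes the expected calibration error (ECE), a widely used calibration metric for classifiers. The text above does not say which calibration metric the study reports, so using ECE here is an assumption, as are the function name `expected_calibration_error` and the choice of equal-width confidence bins.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Weighted average gap between accuracy and mean confidence per bin."""
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("confidences and correct must be non-empty and equal length")

    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins
    for c, ok in zip(confidences, correct):
        # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue
        accuracy = hit_sum[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Ten predictions at 90% confidence, only half of them right: overconfident.
    print(expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5))
```

A model can have high accuracy and still score poorly on a metric like this, which is one reason generalisation and calibration may need to be tuned separately.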

Paper Structure

This paper contains 19 sections, 49 figures, 32 tables, and 1 algorithm.

Figures (49)

  • Figure 1: In-domain evaluation of the differences in rankings compared to not using any noise.
  • Figure 2: Detailed in-domain performance of NNs trained with various noises across the five tasks.
  • Figure 3: OOD evaluation of the differences in rankings compared to not using any noise.
  • Figure 4: Detailed OOD performance of NNs trained with various noises across the four tasks.
  • Figure 5: Transfer of hyperparameters on in-domain (ID) data.
  • ...and 44 more figures