Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification

Puru Vaish, Shunxin Wang, Nicola Strisciuglio

TL;DR

Auxiliary Fourier-basis Augmentation (AFA) improves model robustness to common corruptions, OOD generalization, and the consistency of performance under increasing perturbations, with a negligible loss in standard performance.

Abstract

Computer vision models often suffer degraded performance when deployed in real-world scenarios, due to unexpected changes in inputs that were not accounted for during training. Data augmentation is commonly used to address this issue, as it aims to increase data variety and reduce the distribution gap between training and test data. However, common visual augmentations might not guarantee extensive robustness of computer vision models. In this paper, we propose Auxiliary Fourier-basis Augmentation (AFA), a complementary technique that targets augmentation in the frequency domain and fills the augmentation gap left by visual augmentations. We demonstrate the utility of augmentation via Fourier-basis additive noise in a straightforward and efficient adversarial setting. Our results show that AFA improves the robustness of models against common corruptions, OOD generalization, and the consistency of model performance under increasing perturbations, with a negligible loss in standard performance. It can be seamlessly integrated with other augmentation techniques to further boost performance. Code and models can be found at: https://github.com/nis-research/afa-augment

Paper Structure (28 sections, 3 theorems, 11 equations, 14 figures, 6 tables)

Key Result

Lemma 1

Let $f$ and $g$ be functions of a real variable, and let $\mathscr{F}(f)$ and $\mathscr{F}(g)$ be their Fourier transforms. Then, for any complex numbers $a$ and $b$, $\mathscr{F}(af + bg) = a\mathscr{F}(f) + b\mathscr{F}(g)$. Therefore, the Fourier transform $\mathscr{F}$ is a linear transformation.
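
The linearity property in Lemma 1 can be checked numerically. The sketch below is only an illustration, not part of the paper: it uses a naive discrete Fourier transform as a stand-in for the continuous transform $\mathscr{F}$, and the helper names `dft` and `linearity_gap`, the signal length, and the coefficients are all assumptions chosen for the example.

```python
import cmath
import random


def dft(signal):
    """Naive O(N^2) discrete Fourier transform of a sequence of numbers."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def linearity_gap(f, g, a, b):
    """Largest |F(a*f + b*g) - (a*F(f) + b*F(g))| over all frequencies."""
    lhs = dft([a * fv + b * gv for fv, gv in zip(f, g)])
    f_hat, g_hat = dft(f), dft(g)
    rhs = [a * fk + b * gk for fk, gk in zip(f_hat, g_hat)]
    return max(abs(left - right) for left, right in zip(lhs, rhs))


# Two random real signals and two arbitrary complex coefficients.
rng = random.Random(0)
f = [rng.uniform(-1, 1) for _ in range(16)]
g = [rng.uniform(-1, 1) for _ in range(16)]
gap = linearity_gap(f, g, a=2 - 1j, b=0.5 + 3j)

# The gap is not exactly zero because of floating-point rounding.
print(gap < 1e-9)
```

Linearity is what allows an additive perturbation in pixel space to be read as an additive perturbation in the Fourier domain.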

Figures (14)

  • Figure 1: Frequency augmentation with Fourier-basis functions is complementary to common visual augmentations. They appear unnatural and can be used as adversarial examples.
  • Figure 2: Example of Fourier-basis functions added to natural images. They appear as gratings that obscure spatial information.
  • Figure 3: Schema of the AFA augmentation pipeline. The image $x$ is augmented using AFA, which adds a planar wave to each channel $c$ of the image at a strength $\sigma_c$ sampled from an exponential distribution (Eq. \ref{eqn:afa}). The AFA-augmented image $x^a$ is used for training and is processed through the auxiliary component of the parallel batch normalization layer (for models that use batch normalization to track batch statistics, e.g. ResNet). Other visual augmentations are applied in parallel and used for training via the main component of the normalization layer. Finally, training optimizes two cross-entropy losses, one for the main component and one for the auxiliary component.
  • Figure 4: Relative error per corruption severity, computed as the difference between the classification error of models trained with PRIME, TrivialAugment, and AugMix and that of the corresponding models trained with PRIME+AFA, TrivialAugment+AFA, and AugMix+AFA.
  • Figure 5: Fourier heatmaps of ResNet18 trained with the standard setup, PRIME, and TrivialAugment, with and without AFA.
  • ...and 9 more figures
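
The additive step described in the Figure 3 caption (one planar wave per channel, with a strength drawn from an exponential distribution) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fourier_basis` and `afa_like_augment`, the uniform sampling of the frequency indices, the `mean_strength` and `max_freq` parameters, and the final clipping to $[0, 1]$ are all hypothetical choices. The parallel batch normalization layer and the two cross-entropy losses are omitted.

```python
import math
import random


def fourier_basis(height, width, u, v):
    """Real planar wave cos(2*pi*(u*x/width + v*y/height)), values in [-1, 1]."""
    return [
        [math.cos(2 * math.pi * (u * x / width + v * y / height)) for x in range(width)]
        for y in range(height)
    ]


def afa_like_augment(image, mean_strength=0.1, max_freq=8, rng=random):
    """Add one randomly chosen planar wave to each channel of an image.

    image: list of channels, each a height x width list of floats in [0, 1].
    mean_strength, max_freq, and the clipping are illustrative assumptions.
    """
    augmented = []
    for channel in image:
        height, width = len(channel), len(channel[0])
        # Assumption: frequency indices are sampled uniformly per channel.
        u = rng.randint(-max_freq, max_freq)
        v = rng.randint(-max_freq, max_freq)
        # Strength sigma_c drawn from an exponential distribution.
        sigma = rng.expovariate(1.0 / mean_strength)
        wave = fourier_basis(height, width, u, v)
        augmented.append([
            [min(1.0, max(0.0, channel[y][x] + sigma * wave[y][x])) for x in range(width)]
            for y in range(height)
        ])
    return augmented


# Usage: a flat gray image with 3 channels of 8 x 8 pixels.
rng = random.Random(0)
image = [[[0.5] * 8 for _ in range(8)] for _ in range(3)]
out = afa_like_augment(image, rng=rng)
```

Because each channel receives its own wave and strength, the result typically shows colored gratings over the image, in line with the description in Figure 2.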

Theorems & Definitions (4)

  • Lemma 1: Linearity
  • Lemma 2: Fourier Transform of Plane Wave
  • Theorem 1: AFA Augments the Fourier Domain
  • proof