Investigating the Corruption Robustness of Image Classifiers with Random Lp-norm Corruptions

Georg Siedel; Weijia Shao; Silvia Vock; Andrey Morozov

TL;DR

This work addresses the brittleness of image classifiers to real-world random corruptions by formalizing robustness under $L_p$ distances and introducing a scalable sampling method for random $p$-norm perturbations. It proposes the imperceptible Corruption Error ($\text{iCE}$) and the mean Corruption Error for $p$-norms ($\text{mCE}_{L_p}$) as metrics, and demonstrates that training with combinations of $p$-norm corruptions substantially enhances corruption robustness beyond state-of-the-art augmentations. The study finds that robustness transfers across non-$L_0$ norms and to some real-world corruptions, with lower $p$ values generally yielding stronger benefits, while $L_0$ remains a special case. Practically, the results offer guidance for designing data augmentation pipelines that improve safety and reliability of vision systems in the presence of imperceptible and real-world distortions.

Abstract

Robustness is a fundamental property of machine learning classifiers required to achieve safety and reliability. In the field of adversarial robustness of image classifiers, robustness is commonly defined as the stability of a model to all input changes within a p-norm distance. However, in the field of random corruption robustness, variations observed in the real world are used, while p-norm corruptions are rarely considered. This study investigates the use of random p-norm corruptions to augment the training and test data of image classifiers. We evaluate the model robustness against imperceptible random p-norm corruptions and propose a novel robustness metric. We empirically investigate whether robustness transfers across different p-norms and derive conclusions on which p-norm corruptions a model should be trained and evaluated. We find that training data augmentation with a combination of p-norm corruptions significantly improves corruption robustness, even on top of state-of-the-art data augmentation schemes.
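
As a rough illustration of what drawing a random p-norm corruption can look like, the sketch below samples points inside (or on the surface of) an $L_p$ ball using a standard construction: generalized Gaussian coordinates obtained from Gamma variates, normalized by their $p$-norm and rescaled by a random radius. This is an assumption-laden sketch, not necessarily the sampling algorithm used in the paper. The function name `sample_lp_ball` and its parameters are hypothetical, it only covers finite $p > 0$ (so neither $L_0$ nor $L_\infty$), and for general $p$ the surface samples follow the cone measure rather than the surface-area measure.

```python
import numpy as np


def sample_lp_ball(n, d, p, eps=1.0, on_sphere=False, rng=None):
    """Draw n points in d dimensions from the L_p ball of radius eps.

    Hypothetical helper for illustration; assumes finite p > 0.
    With on_sphere=True the points lie on the surface of the ball.
    """
    rng = np.random.default_rng() if rng is None else rng
    # If G ~ Gamma(1/p, 1), then G**(1/p) has density proportional to
    # exp(-x**p) on [0, inf): a one-sided generalized Gaussian.
    magnitude = rng.gamma(shape=1.0 / p, scale=1.0, size=(n, d)) ** (1.0 / p)
    sign = rng.choice([-1.0, 1.0], size=(n, d))
    x = sign * magnitude
    # Normalize each row to unit p-norm, i.e. project onto the unit L_p sphere.
    norm = np.sum(np.abs(x) ** p, axis=1, keepdims=True) ** (1.0 / p)
    x = x / norm
    if not on_sphere:
        # A radius of U**(1/d), U ~ Uniform(0, 1), spreads the points
        # uniformly over the volume of the ball.
        x = x * rng.uniform(size=(n, 1)) ** (1.0 / d)
    return eps * x


# Usage: a random L_1 perturbation for one flattened 32x32x3 image.
delta = sample_lp_ball(n=1, d=32 * 32 * 3, p=1.0, eps=4.0)
```

In an augmentation pipeline, a perturbation like `delta` would typically be added to a flattened image and the result clipped back to the valid pixel range; the clipping step is left out here because the source does not specify it.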

Paper Structure (13 sections, 4 equations, 6 figures, 6 tables)

INTRODUCTION
Motivation
Contributions
PRELIMINARIES
Robustness definition
Sampling algorithm
RELATED WORK
EXPERIMENTAL SETUP
Robustness Metrics
Training Setup
RESULTS
DISCUSSION
CONCLUSION

Figures (6)

Figure 1: Samples drawn uniformly in 2D from an $L_{0.5}$ and an $L_{10}$ norm sphere (left and right) and an $L_1$ and an $L_2$ norm ball (middle).
Figure 2: Examples from the chosen set of imperceptible corruptions on CIFAR (above) and TinyImageNet (below)
Figure 3: Normalized accuracy when training and testing on different $p$-norm corruptions. For each test corruption (see $mCE_{L_p}$ in Table \ref{corr-sets}), the accuracies of all models trained on $p$-norm corruptions (without additional data augmentation strategies) are first normalized so that the best model achieves 100% accuracy. Then the average accuracy is calculated across all model architectures and datasets as well as across all $\epsilon$-values of the same $p$-norm for training and testing. This visualizes how, on average, training on one $p$-norm leads to robustness against all $p$-norm corruptions. The prior normalization makes the trained models comparable.
Figure 4: Clean Error vs. $mCE$ plot for selected models on TinyImageNet. The green arrows indicate an improvement of both metrics when the model is trained on $p$-norm corruption combinations.
Figure 5: The frequency with which 1000 samples drawn from inside a first $p$-norm ball in the CIFAR-10 input dimension also lie inside a second $L_2$-norm ball with $\epsilon=4$ (blue plot), and the frequency with which 1000 samples drawn from inside the second norm ball also lie inside the first norm ball (orange plot).
...and 1 more figure
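
The normalization and averaging described in the Figure 3 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function name `transfer_matrix`, the array layout `acc[setting, train, test]`, and the toy accuracy values are all invented for the example. Each setting stands for one architecture and dataset pair, and `train_p` / `test_p` label which $p$-norm each training or test corruption belongs to, so that several $\epsilon$-values of the same $p$-norm are averaged together.

```python
import numpy as np


def transfer_matrix(acc, train_p, test_p):
    """Normalize accuracies per test corruption, then average per p-norm.

    acc[s, i, j]: accuracy (%) in setting s of the model trained on
    corruption i, evaluated on test corruption j (hypothetical layout).
    train_p[i], test_p[j]: the p-norm label of each corruption.
    """
    acc = np.asarray(acc, dtype=float)
    # Per setting and test corruption, the best trained model becomes 100%.
    best = acc.max(axis=1, keepdims=True)
    normalized = 100.0 * acc / best
    # Average over architectures and datasets.
    mean = normalized.mean(axis=0)
    # Average over all epsilon-values that share a p-norm (train and test).
    p_train = sorted(set(train_p))
    p_test = sorted(set(test_p))
    out = np.empty((len(p_train), len(p_test)))
    for a, pt in enumerate(p_train):
        rows = [i for i, q in enumerate(train_p) if q == pt]
        for b, pe in enumerate(p_test):
            cols = [j for j, q in enumerate(test_p) if q == pe]
            out[a, b] = mean[np.ix_(rows, cols)].mean()
    return p_train, p_test, out


# Toy numbers (invented): one setting, two training and two test corruptions.
toy_acc = [[[80.0, 40.0],
            [40.0, 80.0]]]
rows, cols, matrix = transfer_matrix(toy_acc, train_p=[1, 2], test_p=[1, 2])
```

Normalizing before averaging keeps an easy test corruption, on which every model scores highly, from dominating a hard one in the averaged matrix.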
