Label Augmentation for Neural Networks Robustness

Fatemeh Amerehi; Patrick Healy

TL;DR

This work tackles out-of-distribution robustness in neural networks by addressing both common corruptions and adversarial perturbations. The authors propose Label Augmentation (LA), which enriches the training signal by concatenating the original class label with an augmentation-specific label. The combined output space keeps class identity invariant under augmentation while distinguishing between augmentation effects. Empirically, LA improves clean accuracy, calibration (ECE/RMS), and robustness to FGSM and PGD attacks across multiple architectures and datasets (CIFAR-10/100 and CIFAR-10-C/CIFAR-100-C), often outperforming conventional augmentation methods and even rivaling adversarial training under certain budgets. The findings suggest LA is a simple, flexible technique with potential applicability beyond images, offering a practical path to more reliable and trustworthy models under distributional shifts.

Abstract

Out-of-distribution generalization must contend with two types of perturbation: common perturbations arising from natural variations in the real world, and adversarial perturbations intentionally crafted to deceive neural networks. While deep neural networks excel in accuracy when training and test data share the same distribution, they often encounter out-of-distribution scenarios that cause a significant decline in accuracy. Data augmentation methods can effectively enhance robustness against common corruptions, but they typically fall short in improving robustness against adversarial perturbations. In this study, we develop Label Augmentation (LA), which enhances robustness against both common and intentional perturbations and improves uncertainty estimation. Our findings indicate that LA improves the clean error rate by up to 23.29% compared to the baseline. It also improves robustness on the common corruptions benchmark by up to 24.23%. Against FGSM and PGD attacks, adversarial robustness improves noticeably, by up to 53.18% for FGSM and 24.46% for PGD.

Paper Structure (9 sections, 3 equations, 4 figures, 12 tables)

Introduction
Related Works
Label Augmentation
Experimental Setup
Configurations and Metrics
Results
Conclusion
Acknowledgement
Appendix

Figures (4)

Figure 1: What do you see when looking at the images?
Figure 2: The CIFAR-10 dataset includes 10 classes representing airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. The one-hot label for horses is [0 0 0 0 0 0 0 1 0 0]. Given three distinct augmentation operation classes, such as contrast, noise, and blur, the one-hot label for noise is [0 1 0]. In standard augmentation, labels remain invariant. When applying Label Augmentation with a smoothing factor $\delta$, the resulting label for a noisy image of a horse is [0 0 0 0 0 0 0 $1-\delta$ 0 0 0 $\delta$ 0]. This maintains invariance with respect to the original categories while distinguishing between more abstract concepts, such as noisy and noise-free inputs.
Figure 3: Examples of augmentation operations applied in Label Augmentation.
Figure 4: Percentage change in error rates relative to standard training on Wide ResNet-50: left, with LA; right, with normal augmentations. While normal augmentation can improve mCE considerably, it does so at the expense of Clean and calibration errors. In contrast, regardless of the type and number of operations used when augmenting with LA, Clean, mCE, calibration, and adversarial errors all improve. Using two or three types of operations proves even more effective.
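
The label construction described in the Figure 2 caption can be illustrated with a minimal sketch. The function name `la_target` and its parameters are hypothetical and not taken from the paper. The sketch covers only the augmented-input case stated in the caption: the class one-hot vector scaled by $1-\delta$ is concatenated with the augmentation one-hot vector scaled by $\delta$.

```python
import numpy as np


def la_target(class_idx, aug_idx, num_classes, num_augs, delta):
    """Build a concatenated soft label for an augmented input.

    Hypothetical helper for illustration: the first `num_classes` entries
    hold the class label weighted by (1 - delta), and the remaining
    `num_augs` entries hold the augmentation label weighted by delta.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    target = np.zeros(num_classes + num_augs, dtype=np.float64)
    target[class_idx] = 1.0 - delta          # original class (e.g. horse)
    target[num_classes + aug_idx] = delta    # augmentation type (e.g. noise)
    return target


# Example from the caption: horse is class index 7 of 10; noise is
# augmentation index 1 of 3 (contrast, noise, blur). delta = 0.1 is an
# arbitrary value chosen for illustration.
horse_with_noise = la_target(class_idx=7, aug_idx=1,
                             num_classes=10, num_augs=3, delta=0.1)
```

The resulting vector has 13 entries that sum to 1, so it can be used as a soft target with a cross-entropy loss over a 13-way output layer. How the authors handle clean, unaugmented inputs is not specified in the caption and is left out of the sketch.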
