Enhancing Adversarial Robustness of Deep Neural Networks Through Supervised Contrastive Learning
Longwei Wang, Navid Nayyem, Abdullah Rakin
TL;DR
This work addresses the fragility of CNN feature representations under adversarial perturbations by proposing a framework that combines supervised contrastive learning with a margin-based contrastive loss. The method jointly optimizes a supervised contrastive objective with a cross-entropy loss and, in a variant, a margin-based contrastive loss, to produce structured, class-discriminative feature spaces with robust decision boundaries. Empirical results on CIFAR-100 with a ResNet-18 backbone show that supervised contrastive learning improves robustness to Fast Gradient Sign Method (FGSM) attacks, while margin-based constraints further stabilize decision boundaries and enhance adversarial resilience, particularly when data augmentation is minimized or avoided. The approach offers a scalable, efficient alternative to computationally heavy adversarial training and could be integrated with other defenses in real-world applications.
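To make the joint objective concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the commonly used supervised contrastive formulation over L2-normalized embeddings with a temperature, and it combines the two terms additively. The function names, the weight `lam`, the default temperature, and the additive combination are all illustrative assumptions that this summary does not specify.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length; a zero vector is left unchanged.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors with >= 1 positive.

    Assumed form: for anchor i, average over same-class samples p of
    -log( exp(z_i.z_p / T) / sum_{a != i} exp(z_i.z_a / T) ).
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        sims = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                for a in range(n) if a != i}
        m = max(sims.values())  # log-sum-exp shift for numerical stability
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

def cross_entropy(logits, labels):
    # Mean softmax cross-entropy over a batch of logit rows.
    total = 0.0
    for row, y in zip(logits, labels):
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[y]
    return total / len(labels)

def joint_loss(embeddings, logits, labels, lam=1.0, temperature=0.1):
    # Hypothetical weighting: cross-entropy plus lam times the contrastive term.
    return cross_entropy(logits, labels) + lam * supcon_loss(embeddings, labels, temperature)
```

In this sketch, embeddings whose same-class samples point in similar directions yield a small contrastive term, while embeddings that mix classes yield a large one, which is the clustering pressure the summary describes.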
Abstract
Adversarial attacks exploit the vulnerabilities of convolutional neural networks by introducing imperceptible perturbations that lead to misclassifications, exposing weaknesses in feature representations and decision boundaries. This paper presents a novel framework that combines supervised contrastive learning with a margin-based contrastive loss to enhance adversarial robustness. Supervised contrastive learning improves the structure of the feature space by clustering embeddings of samples within the same class and separating those from different classes. The margin-based contrastive loss, inspired by support vector machines, enforces explicit constraints to create robust decision boundaries with well-defined margins. Experiments on the CIFAR-100 dataset with a ResNet-18 backbone demonstrate improved adversarial accuracy under Fast Gradient Sign Method (FGSM) attacks.
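The two remaining ingredients named in the abstract, a margin constraint and the FGSM attack, can be sketched in a few lines of Python. The abstract does not give the exact margin formulation, so the sketch assumes one common pairwise hinge form: same-class pairs are pulled together, and different-class pairs are pushed apart until they are at least `margin` away. The FGSM step uses a toy linear softmax classifier so that the input gradient can be written analytically. The classifier, the function names, and all parameter values are hypothetical stand-ins, not the paper's ResNet-18 setup.

```python
import math

def margin_contrastive_loss(embeddings, labels, margin=1.0):
    # Assumed hinge form: d^2 for same-class pairs,
    # max(0, margin - d)^2 for different-class pairs, averaged over all pairs.
    total, pairs = 0.0, 0
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(embeddings[i], embeddings[j])
            if labels[i] == labels[j]:
                total += d * d
            else:
                total += max(0.0, margin - d) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0

def softmax(logits):
    m = max(logits)  # shift for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def linear_logits(W, b, x):
    # Toy classifier: logits = W x + b.
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def fgsm(W, b, x, y, epsilon):
    # FGSM: x_adv = x + epsilon * sign(grad_x loss).
    # For a linear softmax model, grad_x CE = W^T (softmax(logits) - onehot(y)).
    probs = softmax(linear_logits(W, b, x))
    delta = [p - (1.0 if k == y else 0.0) for k, p in enumerate(probs)]
    grad = [sum(W[k][j] * delta[k] for k in range(len(W))) for j in range(len(x))]
    sign = lambda g: (g > 0) - (g < 0)
    # Clipping to a valid pixel range is omitted in this sketch.
    return [xj + epsilon * sign(g) for xj, g in zip(x, grad)]
```

With a deep network, the gradient would come from automatic differentiation rather than the closed form used here. The single signed step of size `epsilon` is what characterizes FGSM.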
