Enhancing Adversarial Robustness of Deep Neural Networks Through Supervised Contrastive Learning

Longwei Wang, Navid Nayyem, Abdullah Rakin

TL;DR

This work addresses the fragility of CNN feature representations under adversarial perturbations by proposing a framework that combines supervised contrastive learning with a margin-based contrastive loss. The method jointly optimizes a supervised contrastive objective and a cross-entropy loss (and, in a variant, a margin-based contrastive loss) to produce structured, class-discriminative feature spaces with robust decision boundaries. Empirical results on CIFAR-100 with a ResNet-18 backbone show that supervised contrastive learning improves robustness to Fast Gradient Sign Method (FGSM) attacks, while margin-based constraints further stabilize decision boundaries and enhance adversarial resilience, particularly when data augmentation is minimized or avoided. The approach offers a scalable, efficient alternative to computationally heavy adversarial training and has potential for integration with other defenses in real-world applications.
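
To make the two contrastive objectives concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: `supcon_loss` follows the commonly used supervised contrastive formulation (each anchor is pulled toward all other samples of its class through a temperature-scaled softmax over similarities), and `margin_contrastive_loss` is the classic pairwise hinge-style contrastive loss, swapped in here as a stand-in for the paper's margin-based variant. The function names, the temperature of 0.1, and the margin of 1.0 are illustrative assumptions, and the weighting used when combining these terms with cross-entropy is not specified in this summary, so it is left out.

```python
import math

def l2_normalize(v):
    # Project an embedding onto the unit sphere (guards against a zero vector).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors with >= 1 positive."""
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class partner contributes nothing
        logits = [dot(z[i], z[a]) / temperature for a in range(n) if a != i]
        m = max(logits)  # log-sum-exp trick for numerical stability
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits))
        log_probs = [dot(z[i], z[p]) / temperature - log_denom for p in positives]
        total += -sum(log_probs) / len(positives)
        counted += 1
    return total / counted if counted else 0.0

def margin_contrastive_loss(embeddings, labels, margin=1.0):
    """Pairwise hinge-style loss (assumed stand-in for the paper's margin term)."""
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(z[i], z[j])))
            if labels[i] == labels[j]:
                total += d ** 2                      # pull same-class pairs together
            else:
                total += max(0.0, margin - d) ** 2   # push other classes past the margin
            pairs += 1
    return total / pairs if pairs else 0.0

# Toy 2-D embeddings: class-aligned clusters versus the same points mislabeled.
labels = [0, 0, 1, 1]
clustered = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
print(supcon_loss(clustered, labels), supcon_loss(mixed, labels))
print(margin_contrastive_loss(clustered, labels), margin_contrastive_loss(mixed, labels))
```

In this toy setup both losses are lower for the class-aligned clusters than for the mislabeled arrangement, which is the structural property the framework is trying to encourage in the feature space.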

Abstract

Adversarial attacks exploit the vulnerabilities of convolutional neural networks (CNNs) by introducing imperceptible perturbations that cause misclassifications, exposing weaknesses in feature representations and decision boundaries. This paper presents a framework that combines supervised contrastive learning with a margin-based contrastive loss to enhance adversarial robustness. Supervised contrastive learning improves the structure of the feature space by clustering the embeddings of samples from the same class and separating them from those of other classes. The margin-based contrastive loss, inspired by support vector machines, enforces explicit constraints that yield robust decision boundaries with well-defined margins. Experiments on the CIFAR-100 dataset with a ResNet-18 backbone demonstrate improved adversarial accuracy under Fast Gradient Sign Method (FGSM) attacks.
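
For readers unfamiliar with the attack used in the evaluation, the sketch below illustrates FGSM on a toy model. FGSM perturbs an input by a single step of size epsilon in the direction of the sign of the loss gradient with respect to that input. The paper evaluates a ResNet-18; the linear softmax classifier, its weights, the input, and the epsilon of 0.1 used here are all hypothetical and serve only to keep the example self-contained and dependency-free.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def logits_of(W, b, x):
    # Linear classifier: one row of W and one bias per class.
    return [sum(w * xj for w, xj in zip(row, x)) + bk for row, bk in zip(W, b)]

def cross_entropy(W, b, x, y):
    return -math.log(softmax(logits_of(W, b, x))[y])

def input_gradient(W, b, x, y):
    # For softmax + cross-entropy, dL/dx = W^T (p - onehot(y)).
    p = softmax(logits_of(W, b, x))
    p[y] -= 1.0
    return [sum(W[k][j] * p[k] for k in range(len(W))) for j in range(len(x))]

def fgsm(W, b, x, y, epsilon):
    """One-step FGSM: x_adv = clip(x + epsilon * sign(dL/dx), 0, 1)."""
    grad = input_gradient(W, b, x, y)
    sign = lambda g: (g > 0) - (g < 0)
    return [min(1.0, max(0.0, xj + epsilon * sign(gj))) for xj, gj in zip(x, grad)]

# Hypothetical two-class model and input with values in [0, 1].
W = [[1.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.0]
x, y = [0.6, 0.4], 0
x_adv = fgsm(W, b, x, y, epsilon=0.1)
print("clean loss:", cross_entropy(W, b, x, y))
print("adversarial loss:", cross_entropy(W, b, x_adv, y))
```

Adversarial accuracy, the metric reported in the experiments, is then the fraction of such perturbed inputs that the model still classifies correctly.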

Paper Structure (23 sections, 5 equations, 12 figures)

Figures (12)

  • Figure 1: Supervised Contrastive Learning for enhanced robustness: neural network architecture with shared weights, residual blocks, projection heads, and loss functions. The architecture consists of two parallel streams processing input images A and B, with shared weights across residual blocks. Each stream outputs two projection vectors, which are used for computing Cross Entropy Loss and Supervised Contrastive Loss.
  • Figure 2: Baseline Model Loss for CIFAR-100 with data augmentation. This figure illustrates the training loss for the baseline model over 200 epochs.
  • Figure 3: Baseline Model Loss Without Data Augmentation for CIFAR-100. This figure shows the impact of excluding data augmentation during training.
  • Figure 4: Baseline Model Loss with Margin SCL for CIFAR-100. This figure highlights the baseline loss when using margin SCL.
  • Figure 5: Baseline Model Loss with Refined SCL for CIFAR-100. This figure illustrates the baseline model loss with refinements using the SCL approach.
  • ...and 7 more figures