An Empirical Study of Accuracy-Robustness Tradeoff and Training Efficiency in Self-Supervised Learning

Fatemeh Ghofrani, Pooyan Jamshidi

TL;DR

This work addresses the efficiency and robustness of self-supervised learning (SSL) under adversarial perturbations by revisiting EMP-SSL and introducing CF-AMC-SSL, a cost-efficient SSL method that uses aggressive multi-crop augmentation combined with free adversarial training. It demonstrates that increasing the number of crops per image can compensate for fewer training epochs, achieving fast convergence while maintaining or improving clean accuracy and adversarial robustness, outperforming robust SimCLR. The study provides extensive experiments on CIFAR-10/100 (and ImageNet-100) with ResNet backbones, showing crop-based EMP-SSL generally offers a better accuracy-robustness tradeoff, and that CF-AMC-SSL can reduce training time by orders of magnitude with competitive performance. Public code is provided to facilitate adoption and further research in robust SSL applications.

Abstract

Self-supervised learning (SSL) has significantly advanced image representation learning, yet efficiency challenges persist, particularly with adversarial training. Many SSL methods require extensive epochs to achieve convergence, a demand further amplified in adversarial settings. To address this inefficiency, we revisit the robust EMP-SSL framework, emphasizing the importance of increasing the number of crops per image to accelerate learning. Unlike traditional contrastive learning, robust EMP-SSL leverages multi-crop sampling, integrates an invariance term and regularization, and reduces training epochs, enhancing time efficiency. Evaluated with both standard linear classifiers and multi-patch embedding aggregation, robust EMP-SSL provides new insights into SSL evaluation strategies. Our results show that robust crop-based EMP-SSL not only accelerates convergence but also achieves a superior balance between clean accuracy and adversarial robustness, outperforming multi-crop embedding aggregation. Additionally, we extend this approach with free adversarial training in Multi-Crop SSL, introducing the Cost-Free Adversarial Multi-Crop Self-Supervised Learning (CF-AMC-SSL) method. CF-AMC-SSL demonstrates the effectiveness of free adversarial training in reducing training time while simultaneously improving clean accuracy and adversarial robustness. These findings underscore the potential of CF-AMC-SSL for practical SSL applications. Our code is publicly available at https://github.com/softsys4ai/CF-AMC-SSL.
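The "free" adversarial training idea mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the CF-AMC-SSL implementation: it assumes a two-feature logistic classifier with a supervised loss instead of an SSL objective, a minibatch size of one, and hypothetical names (`free_adversarial_training`, `replays`, `toy_data`). It only shows the general mechanism as commonly described: each sample is replayed several times, and a single gradient computation per replay updates both the model weights and the perturbation, while the number of passes over the data is divided by the replay count.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def free_adversarial_training(data, epochs=8, replays=4, eps=0.1, lr=0.5, seed=0):
    """Toy 'free' adversarial training on a 2-feature logistic model.

    Assumption: a supervised logistic loss stands in for the SSL loss.
    """
    rng = random.Random(seed)
    w = [0.0, 0.0]
    b = 0.0
    delta = [0.0, 0.0]  # perturbation persists across samples and replays
    # Dividing the epochs by the replay count keeps the total number of
    # gradient computations equal to ordinary training.
    for _ in range(max(1, epochs // replays)):
        order = list(range(len(data)))
        rng.shuffle(order)
        for idx in order:
            x, y = data[idx]
            for _ in range(replays):
                x_adv = [xi + di for xi, di in zip(x, delta)]
                z = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
                p = 1.0 / (1.0 + math.exp(-z))
                err = p - y  # d(loss)/dz for the logistic loss
                grad_w = [err * xi for xi in x_adv]
                grad_x = [err * wi for wi in w]
                # Descent step on the weights ...
                w = [wi - lr * g for wi, g in zip(w, grad_w)]
                b -= lr * err
                # ... and ascent step on the perturbation, reusing the same
                # pass, then clipping to the eps-ball.
                delta = [max(-eps, min(eps, di + eps * sign(g)))
                         for di, g in zip(delta, grad_x)]
    return w, b, delta


# Hypothetical, linearly separable toy data: (features, label).
toy_data = [([1.0, 1.2], 1), ([0.8, 1.0], 1), ([-1.0, -0.9], 0), ([-1.2, -1.1], 0)]
w, b, delta = free_adversarial_training(toy_data)
```

With `replays=1` and `eps=0` the loop reduces to plain stochastic gradient descent, which makes it easy to see that the adversarial update adds no extra gradient computations.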

Paper Structure (21 sections, 5 figures, 6 tables, 1 algorithm)

This paper contains 21 sections, 5 figures, 6 tables, 1 algorithm.

Figures (5)

  • Figure 1: Illustration of workflow comparison
  • Figure 2: Evaluation of robustness against PGD attacks through adversarial pretraining on the CIFAR-10 and CIFAR-100 datasets. We compare robust SimCLR and robust EMP-SSL with central-crop evaluation under different training configurations. The analysis covers patch-based SimCLR with varying patch sizes and baseline SimCLR, and reveals a noticeable trade-off between clean accuracy and robustness: larger patch sizes in robust SimCLR improve robustness but reduce clean accuracy. We also compare crop-based EMP-SSL (with varying crop sizes) to baseline EMP-SSL, showing that the crop-based approach significantly enhances robustness. Notably, robust EMP-SSL achieves a superior balance between clean accuracy and robustness compared to robust SimCLR. The variables $S$ and $R$ correspond to the scales and ratios used in the PyTorch framework's RandomResizedCrop method.
  • Figure 3: Robustness against PGD attacks after adversarial pretraining on the CIFAR-10 and CIFAR-100 datasets. We compare patch-based SimCLR (with various patch sizes) to baseline SimCLR. The results reveal a noticeable trade-off between clean accuracy and robustness: increasing the patch size reduces clean accuracy but improves robustness. In addition, central cropping (first column) is more efficient in terms of overall complexity, clean accuracy, and robustness. The variables $S$ and $R$ correspond to the scales and ratios used in the PyTorch framework's RandomResizedCrop method.
  • Figure 4: Robustness against PGD attacks after adversarial pretraining on the CIFAR-10 and CIFAR-100 datasets. We compare crop-based EMP-SSL (with various crop sizes) to baseline EMP-SSL. The crop-based approach in EMP-SSL shows enhanced robustness. Compared with the SimCLR results in Figure 3, robust EMP-SSL achieves a superior balance between clean accuracy and robustness. Here, the variables $s$ and $r$ denote the scales and ratios used for the RandomResizedCrop method in the PyTorch framework.
  • Figure 5: Evaluation of robust EMP-SSL across different patch (crop) sizes on the CIFAR-10 and CIFAR-100 datasets. With the patch-based EMP-SSL method and multi-patch aggregation during evaluation, a large increase in the number of patches noticeably improves clean accuracy. With crop-based EMP-SSL and central-crop evaluation, a better balance between clean accuracy and robustness can be reached, most visibly with a moderate number of crops, such as 16. Note that "Crop-based (4)" means augmentation with scales (S) of (0.08, 1.0) and ratios (R) of (0.75, 1.3), with (4) denoting the number of crops. Similarly, "Patch-based (4)" uses scales (S) of (0.25, 0.25) and ratios (R) of (1.0, 1.0), with (4) denoting the number of patches.
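
To make the scale (S) and ratio (R) settings in the captions above concrete, the following is a minimal sketch of how such parameters can determine crop boxes. It is a simplified, standard-library imitation of the sampling idea behind RandomResizedCrop, not the library's code: it returns box coordinates instead of resized images, it falls back to the full image when no valid box is found, and the function names (`sample_crop_box`, `multi_crop_boxes`) and the 32×32 image size are assumptions chosen for illustration.

```python
import math
import random


def sample_crop_box(width, height, scale, ratio, rng, attempts=10):
    """Sample (left, top, w, h): area fraction drawn from `scale`,
    aspect ratio w / h drawn log-uniformly from `ratio`."""
    area = width * height
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(attempts):
        target_area = area * rng.uniform(scale[0], scale[1])
        aspect = math.exp(rng.uniform(log_lo, log_hi))
        w = round(math.sqrt(target_area * aspect))
        h = round(math.sqrt(target_area / aspect))
        if 0 < w <= width and 0 < h <= height:
            left = rng.randint(0, width - w)  # inclusive bounds
            top = rng.randint(0, height - h)
            return left, top, w, h
    # Simplified fallback (an assumption of this sketch): the full image.
    return 0, 0, width, height


def multi_crop_boxes(n_crops, width=32, height=32,
                     scale=(0.08, 1.0), ratio=(0.75, 1.3), seed=0):
    """Draw n_crops independent boxes for one image."""
    rng = random.Random(seed)
    return [sample_crop_box(width, height, scale, ratio, rng)
            for _ in range(n_crops)]


# "Crop-based (16)": variable-size crops with S = (0.08, 1.0), R = (0.75, 1.3).
crop_boxes = multi_crop_boxes(16)
# "Patch-based (16)": fixed-size square patches with S = (0.25, 0.25), R = (1.0, 1.0).
patch_boxes = multi_crop_boxes(16, scale=(0.25, 0.25), ratio=(1.0, 1.0))
```

Under these assumptions, the patch-based setting always yields 16×16 boxes on a 32×32 image (a quarter of the area, square), differing only in position, whereas the crop-based setting varies both the size and the shape of each box.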