Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling

Jonas Ngnawé, Maxime Heuillet, Sabyasachi Sahoo, Yann Pequignot, Ola Ahmad, Audrey Durand, Frédéric Precioso, Christian Gagné

TL;DR

Fine-tuning a non-robust pretrained model with a robust objective impedes task adaptation early in training and ultimately prevents optimal transfer. The paper proposes Epsilon-Scheduling, a heuristic that schedules the perturbation strength used during training, to promote optimal transfer.

Abstract

Fine-tuning pretrained models is a standard and effective workflow in modern machine learning. However, robust fine-tuning (RFT), which aims to simultaneously achieve adaptation to a downstream task and robustness to adversarial examples, remains challenging. Despite the abundance of non-robust pretrained models in open-source repositories, their potential for RFT is less understood. We address this knowledge gap by systematically examining RFT from such non-robust models. Our experiments reveal that fine-tuning non-robust models with a robust objective, even under small perturbations, can lead to poor performance, a phenomenon that we dub suboptimal transfer. In challenging scenarios (e.g., difficult tasks, high perturbation), the resulting performance can be so low that it may be considered a transfer failure. We find that fine-tuning using a robust objective impedes task adaptation at the beginning of training and eventually prevents optimal transfer. To address this, we propose a novel heuristic, Epsilon-Scheduling, a schedule over perturbation strength used during training that promotes optimal transfer. Additionally, we introduce expected robustness, a metric that captures performance across a range of perturbations, providing a more comprehensive evaluation of the accuracy-robustness trade-off for diverse models at test time. Extensive experiments on a wide range of configurations (six pretrained models and five datasets) show that Epsilon-Scheduling successfully prevents suboptimal transfer and consistently improves expected robustness.
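To make the idea of a schedule over perturbation strength concrete, here is a minimal sketch in Python. The abstract does not specify the shape of the schedule, so everything below is an assumption for illustration: the function name `epsilon_at`, the parameters `hold_epochs` and `ramp_epochs`, and the choice of a zero-perturbation phase followed by a linear ramp up to the target strength $\varepsilon_g$. It should not be read as the authors' exact schedule.

```python
def epsilon_at(epoch: int, eps_target: float, hold_epochs: int, ramp_epochs: int) -> float:
    """Hypothetical perturbation-strength schedule (illustrative assumption).

    Phase 1: epsilon is 0 for the first `hold_epochs` epochs, so training
             starts out as standard fine-tuning and the model can adapt
             to the downstream task.
    Phase 2: epsilon grows linearly over the next `ramp_epochs` epochs.
    Phase 3: epsilon stays at `eps_target` for the rest of training.
    """
    if epoch < hold_epochs:
        return 0.0
    if ramp_epochs <= 0:
        # No ramp requested: jump straight to the target strength.
        return eps_target
    progress = (epoch - hold_epochs) / ramp_epochs
    return eps_target * min(1.0, progress)


# Example: target strength 8/255, 2 clean epochs, 4 ramp epochs.
schedule = [epsilon_at(e, 8 / 255, hold_epochs=2, ramp_epochs=4) for e in range(8)]
```

In a training loop, the value returned for the current epoch would be passed to whatever attack generates the adversarial examples for that epoch. Starting at zero reflects the paper's observation that a robust objective impedes task adaptation at the beginning of training.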

Paper Structure

This paper contains 24 sections, 4 equations, 12 figures, 12 tables.

Figures (12)

  • Figure 1: RFT can lead to suboptimal transfer even when optimizing for small perturbation strengths ($\varepsilon_g$). The severity is highly model- and dataset-dependent.
  • Figure 2: Epsilon-Scheduling
  • Figure 3: RFT delays task adaptation. Validation clean accuracy under standard fine-tuning ($\varepsilon_g=0$) and RFT-fix with $\varepsilon_g \in [1/255, 9/255]$ on three datasets. The crosses indicate the onset of task adaptation (when validation accuracy exceeds $5\%$). Stronger perturbations cause longer delays and more severe suboptimal transfer. See Section \ref{par:hpdta} for analysis.
  • Figure 4: The expected robustness metric offers a valuable perspective for model selection. The larger the area under the curve (shaded area), the higher the expected robustness. The values in the legend indicate the clean accuracy and the evaluation at $\varepsilon_g$.
  • Figure 5: Epsilon-Scheduling mitigates suboptimal transfers and consistently improves expected robustness even when robust accuracy is equivalent. Aggregated results from Table \ref{tab:tab4} and Table \ref{tab:tab8}.
  • ...and 7 more figures
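
The Figure 4 caption ties expected robustness to the area under the accuracy-versus-perturbation curve. The sketch below shows one way such an area could be computed from a handful of evaluation points. The exact definition is not given in this excerpt, so the function name `expected_robustness`, the trapezoidal rule, and the normalization by the width of the perturbation range (which amounts to weighting all perturbation strengths uniformly) are assumptions made for illustration.

```python
from typing import Sequence


def expected_robustness(epsilons: Sequence[float], accuracies: Sequence[float]) -> float:
    """Normalized area under an accuracy-vs-epsilon curve (illustrative assumption).

    `epsilons` must be strictly increasing; `accuracies[i]` is the accuracy
    measured under perturbation strength `epsilons[i]` (index 0 is typically
    the clean accuracy at epsilon = 0).
    """
    if len(epsilons) != len(accuracies) or len(epsilons) < 2:
        raise ValueError("need at least two (epsilon, accuracy) pairs of equal length")

    area = 0.0
    for i in range(1, len(epsilons)):
        width = epsilons[i] - epsilons[i - 1]
        if width <= 0:
            raise ValueError("epsilons must be strictly increasing")
        # Trapezoidal rule on each interval between evaluation points.
        area += 0.5 * width * (accuracies[i] + accuracies[i - 1])

    # Divide by the range so the result stays on the same scale as accuracy.
    return area / (epsilons[-1] - epsilons[0])


# Toy numbers, not results from the paper.
score = expected_robustness([0.0, 1.0, 2.0], [0.9, 0.5, 0.1])
```

Under this normalization, a model whose accuracy is constant across the range scores exactly that accuracy, and two models with the same accuracy at $\varepsilon_g$ can still differ in score if one degrades more gracefully at other perturbation strengths.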