Robust Tickets Can Transfer Better: Drawing More Transferable Subnetworks in Transfer Learning

Yonggan Fu, Ye Yuan, Shang Wu, Jiayi Yuan, Yingyan Celine Lin

TL;DR

This work tackles the challenge of deploying transfer-learned representations on resource-constrained devices by introducing robust tickets, subnetworks identified from adversarially robust pretrained models. It proposes a two-stage pipeline that first robustifies dense models during pretraining and then extracts subnetworks via three pruning schemes to maximize downstream transfer under sparsity. Across diverse tasks, sparsity patterns, and pretraining schemes, robust tickets consistently outperform natural tickets in transfer accuracy, with stronger gains when domain gaps between source and target tasks are large. The findings point to a practical path for achieving improved accuracy-efficiency trade-offs in edge settings, potentially enabling more scalable and reliable deployment of pretrained representations.
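
The subnetwork-extraction stage summarized above can be illustrated with a minimal sketch of global magnitude pruning applied in one shot. This assumes that OMP in the figure captions stands for one-shot magnitude pruning; the function names `one_shot_magnitude_masks` and `apply_masks` are hypothetical, flat lists of floats stand in for weight tensors, and the adversarially robust pretraining stage is not shown. It is not the authors' implementation.

```python
from typing import Dict, List


def one_shot_magnitude_masks(
    weights: Dict[str, List[float]], sparsity: float
) -> Dict[str, List[int]]:
    """Return 0/1 masks that drop the globally smallest-magnitude weights.

    `sparsity` is the fraction of all weights to prune, in [0, 1).
    Weights tied with the threshold are pruned too, so slightly more
    than the requested fraction may be removed when ties occur.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")

    # Rank every weight in the model by absolute value (global ranking).
    magnitudes = sorted(abs(w) for layer in weights.values() for w in layer)
    n_prune = int(sparsity * len(magnitudes))
    if n_prune == 0:
        return {name: [1] * len(layer) for name, layer in weights.items()}

    # Largest magnitude that still gets pruned.
    threshold = magnitudes[n_prune - 1]
    return {
        name: [1 if abs(w) > threshold else 0 for w in layer]
        for name, layer in weights.items()
    }


def apply_masks(
    weights: Dict[str, List[float]], masks: Dict[str, List[int]]
) -> Dict[str, List[float]]:
    """Zero out pruned weights; kept weights are returned unchanged."""
    return {
        name: [w if m else 0.0 for w, m in zip(layer, masks[name])]
        for name, layer in weights.items()
    }


# Toy stand-in for a pretrained model: two "layers" of four weights each.
pretrained = {
    "conv1": [0.9, -0.05, 0.4, -0.7],
    "fc": [0.02, -0.3, 0.6, 0.1],
}
masks = one_shot_magnitude_masks(pretrained, sparsity=0.5)
ticket = apply_masks(pretrained, masks)
```

In this sketch the mask, not the pruned weight values, is what defines the ticket: the same mask would be held fixed while the surviving weights are finetuned on the downstream task. Under the paper's framing, a robust ticket and a natural ticket would differ only in which pretrained weights are fed to the pruning step.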

Abstract

Transfer learning leverages the feature representations of deep neural networks (DNNs) pretrained on data-rich source tasks to enable effective finetuning on downstream tasks. However, pretrained models often must be prohibitively large to deliver generalizable representations, which limits their deployment on resource-constrained edge devices. To close this gap, we propose a new transfer learning pipeline that builds on our finding that robust tickets can transfer better, i.e., subnetworks drawn with properly induced adversarial robustness achieve better transferability than vanilla lottery ticket subnetworks. Extensive experiments and ablation studies validate that the proposed pipeline achieves better accuracy-sparsity trade-offs across diverse downstream tasks and sparsity patterns, further enriching the lottery ticket hypothesis.

Paper Structure (13 sections, 2 equations, 9 figures, 2 tables)

This paper contains 13 sections, 2 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Comparing the whole-model finetuning accuracy of robust tickets and natural tickets identified via OMP from ResNet18/50 on CIFAR-10/100, with zoom-ins on extreme sparsity ($90\%\sim99\%$).
  • Figure 2: Comparing the linear evaluation accuracy of robust tickets and natural tickets identified via OMP.
  • Figure 3: Comparing structured robust tickets with natural ones discovered via OMP from ResNet50.
  • Figure 4: Benchmarking robust tickets against natural ones discovered by IMP variants. Accuracy under high sparsity is zoomed in.
  • Figure 5: Benchmarking robust tickets against natural tickets discovered by LMP from ResNet18/50 on CIFAR-10/100.
  • ...and 4 more figures