Optimizing Specific and Shared Parameters for Efficient Parameter Tuning

Van-Anh Nguyen, Thanh-Toan Do, Mehrtash Harandi, Dinh Phung, Trung Le

TL;DR

SaS addresses the high cost of fine-tuning foundation models by separating adaptation into a shared cross-layer component and lightweight layer-specific hypernetwork modules. The shared module uses low-rank projections, and the hypernetworks generate per-layer parameters, so the method adds less than 0.05% extra parameters while improving performance. Empirical results on VTAB-1k, domain generalization benchmarks, and few-shot learning show that SaS matches or outperforms state-of-the-art PETL methods with far fewer trainable parameters. Ablation studies confirm that combining the shared and layer-specific components yields the strongest gains, highlighting the value of capturing both common structure and depth-specific information in transfer learning.

Abstract

Foundation models, with a vast number of parameters and pretraining on massive datasets, achieve state-of-the-art performance across various applications. However, efficiently adapting them to downstream tasks with minimal computational overhead remains a challenge. Parameter-Efficient Transfer Learning (PETL) addresses this by fine-tuning only a small subset of parameters while preserving pre-trained knowledge. In this paper, we propose SaS, a novel PETL method that effectively mitigates distributional shifts during fine-tuning. SaS integrates (1) a shared module that captures common statistical characteristics across layers using low-rank projections and (2) a layer-specific module that employs hypernetworks to generate tailored parameters for each layer. This dual design ensures an optimal balance between performance and parameter efficiency while introducing less than 0.05% additional parameters, making it significantly more compact than existing methods. Extensive experiments on diverse downstream tasks, few-shot settings and domain generalization demonstrate that SaS significantly enhances performance while maintaining superior parameter efficiency compared to existing methods, highlighting the importance of capturing both shared and layer-specific information in transfer learning. Code and data are available at https://anonymous.4open.science/r/SaS-PETL-3565.
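
The two-module design described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `SharedAndSpecificAdapter`, the rank and embedding sizes, the choice to have the hypernetwork emit a per-layer scaling vector in the low-rank space, and the zero initialization of the up-projection are all assumptions made for illustration. The sketch only shows how a single low-rank projection pair can be shared across layers while a small hypernetwork, fed a learned layer embedding, produces the layer-specific parameters.

```python
import numpy as np


class SharedAndSpecificAdapter:
    """Hypothetical adapter: shared low-rank projections + hypernetwork-generated per-layer scale."""

    def __init__(self, dim, rank, num_layers, embed_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Shared module: one down/up projection pair reused by every layer.
        self.down = rng.normal(0.0, 0.02, (dim, rank))
        self.up = np.zeros((rank, dim))  # zero init: the adapter starts as an identity map
        # Layer-specific module: one small embedding per layer, plus a single
        # linear hypernetwork that maps an embedding to a rank-sized scale vector.
        self.layer_embed = rng.normal(0.0, 0.02, (num_layers, embed_dim))
        self.hyper_w = rng.normal(0.0, 0.02, (embed_dim, rank))
        self.hyper_b = np.ones(rank)

    def layer_scale(self, layer_idx):
        # Hypernetwork forward pass: layer embedding -> per-layer parameters.
        return self.layer_embed[layer_idx] @ self.hyper_w + self.hyper_b

    def __call__(self, h, layer_idx):
        z = h @ self.down                    # shared low-rank down-projection
        z = z * self.layer_scale(layer_idx)  # layer-specific modulation
        return h + z @ self.up               # shared up-projection, residual connection

    def num_params(self):
        arrays = (self.down, self.up, self.layer_embed, self.hyper_w, self.hyper_b)
        return sum(a.size for a in arrays)


# Example with ViT-B/16-like shapes (12 layers, 768-dim tokens, 197 tokens per image).
adapter = SharedAndSpecificAdapter(dim=768, rank=4, num_layers=12, embed_dim=8)
tokens = np.ones((2, 197, 768))
adapted = adapter(tokens, layer_idx=0)
```

In this sketch the per-layer cost is only the layer embedding (8 values here); the projections and the hypernetwork weights are shared across all layers, which is one way a design of this kind can keep the number of added parameters small.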

Paper Structure

This paper contains 21 sections, 3 equations, 2 figures, 7 tables.

Figures (2)

  • Figure 1: Proposed architecture of SaS, which consists of two modules. The shared module employs low-rank projection techniques to capture and reinforce the common statistical characteristics present in the dataset. The layer-specific module uses a hypernetwork to generate parameters for each individual layer, tailoring features at each level of abstraction.
  • Figure 2: Top-1 accuracy on the fine-grained few-shot benchmark with ViT-B/16 as the backbone. Best viewed in color.