Pear: Pruning and Sharing Adapters in Visual Parameter-Efficient Fine-Tuning

Yibo Zhong, Yao Zhou

TL;DR

Prune and Share (Pear) is a novel adapter-pruning framework for efficient fine-tuning of pretrained visual foundation models; it preserves the information of the pruned adapters and further boosts performance.

Abstract

Adapters have been widely explored to alleviate computational and storage costs when fine-tuning pretrained foundation models. However, the adapter itself can exhibit redundancy, leading to unnecessary storage overhead and inferior performance. In this paper, we propose Prune and Share (Pear), a novel adapter-pruning framework for efficient fine-tuning of pretrained visual foundation models. Specifically, we prune certain adapters and share the more important unpruned ones with positions where adapters are pruned, allowing continual adaptation at these positions after pruning. Additionally, a knowledge checkpoint strategy is introduced, which preserves the information of the pruned adapters and further boosts performance. Experimental results on a visual adaptation benchmark validate the effectiveness and efficiency of the proposed Pear compared to other competitive methods. Code is available at https://github.com/yibozhong/pear.

Paper Structure (11 sections, 3 equations, 3 figures, 3 tables)

This paper contains 11 sections, 3 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Comparison between vanilla structural pruning techniques and Pear pruning. In the example provided, we begin with a total of 18 parameters before pruning and aim to prune 50% of them. We first analyze the pruning process of the two methods, which are essentially the same in the first two phases but diverge in the third. Instead of simply pruning the redundant adapters and leaving the corresponding positions unadapted, Pear shares the unpruned adapters with these positions. Since we use one color to indicate the level of an adapter, the red and yellow adapters in the Pear result are obtained by sharing unpruned red and yellow adapters from other layers. This allows for further adaptation without any additional parameters. The term 'actual params' indicates the number of parameters after pruning, while 'params in effect' indicates the number of positions that are being adapted.
  • Figure 2: Illustration of the prune-and-share process of Pear. The two low-rank adapters are combined and considered as a whole. The adapters at each position are first sorted by their contributions, just as in vanilla pruning. In this example we aim to prune 3 adapters in total, so we choose the 3 adapters with the least contribution (i.e., the 3 leftmost adapters). After vanilla pruning, the 3 positions at indices 0, 2, and 3 have no adaptation, which, despite the seemingly trivial contribution of their adapters, could still influence the overall adaptation performance. Pear tackles this issue by sharing the 3 adapters that contribute the most with these positions, enabling continual adaptation even when the adapters at these positions are pruned. Additional techniques proposed by Pear, such as the knowledge checkpoint, can also be employed under this framework.
  • Figure 3: Comparison between Pear and vanilla pruning on FGVC and StanfordCars datasets. Vanilla pruning is denoted as VP here.
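
The prune-and-share step described in the captions of Figures 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (see the linked repository for that). The function names `prune_and_share` and `count_params`, the contribution scores, the rule that pairs the i-th least contributing position with the i-th most contributing adapter, the restriction to pruning at most half of the adapters, and the choice of 3 parameters per adapter are all hypothetical and chosen only so the numbers echo the 18-parameter, 50% example.

```python
from typing import Dict, Sequence, Tuple


def prune_and_share(scores: Sequence[float], num_prune: int) -> Dict[int, int]:
    """Map every adapter position to the index of the adapter it uses.

    Assumptions (not from the paper): `scores` holds one contribution
    score per position, and the i-th least contributing position
    receives the i-th most contributing adapter.
    """
    n = len(scores)
    # Assumption: prune at most half, so each donor is shared only once.
    if not 0 <= num_prune <= n // 2:
        raise ValueError("num_prune must be between 0 and len(scores) // 2")

    # Sort positions by contribution, ascending, as in vanilla pruning.
    order = sorted(range(n), key=lambda i: scores[i])
    pruned = order[:num_prune]            # least contributing positions
    donors = order[n - num_prune:][::-1]  # most contributing, best first

    # Every position starts with its own adapter.
    assignment = {i: i for i in range(n)}
    # Pruned positions reuse a donor adapter instead of staying unadapted.
    for position, donor in zip(pruned, donors):
        assignment[position] = donor
    return assignment


def count_params(assignment: Dict[int, int], params_per_adapter: int) -> Tuple[int, int]:
    """Return ('actual params', 'params in effect') for an assignment."""
    actual = len(set(assignment.values())) * params_per_adapter
    in_effect = len(assignment) * params_per_adapter
    return actual, in_effect


# Hypothetical scores for 6 positions; positions 0, 2 and 3 contribute least.
scores = [0.1, 0.9, 0.2, 0.3, 0.8, 0.7]
assignment = prune_and_share(scores, num_prune=3)
print(assignment)                                       # {0: 1, 1: 1, 2: 4, 3: 5, 4: 4, 5: 5}
print(count_params(assignment, params_per_adapter=3))   # (9, 18)
```

With these made-up scores, positions 0, 2, and 3 lose their own adapters, as in Figure 2, but each is still adapted by a shared one. Stored parameters drop from 18 to 9 while all 18 remain in effect. Vanilla pruning would instead leave the three pruned positions with no adapter at all.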