Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy?
Nader Asadi, Mahdi Beitollahi, Yasser Khalil, Yinchuan Li, Guojun Zhang, Xi Chen
TL;DR
This work investigates the composability of parameter-efficient LoRA modules for few-shot transfer to unseen tasks. It analyzes two simple strategies, uniform averaging and learned weighted averaging, across vision and language models. Both improve few-shot transfer accuracy, and in full-shot settings learned composition performs comparably to regular LoRA training while using far fewer trainable parameters. The authors demonstrate robustness across label, task, and covariate shifts, and show through CKA analyses that learned composition can assign higher weights to upstream modules that are more similar to the downstream task. Overall, the findings highlight the potential of modular, add-on adapters to improve transferability without full re-tuning, with implications for scalable and reusable foundation-model adaptation.
Abstract
Parameter-efficient fine-tuning stands as the standard for efficiently fine-tuning large language and vision models on downstream tasks. Specifically, the efficiency of low-rank adaptation (LoRA) has facilitated the creation and sharing of hundreds of custom LoRA modules, each trained on distinct data from various downstream tasks. In this paper, we explore the composability of LoRA modules, examining whether combining these pre-trained modules enhances generalization to unseen downstream tasks. We evaluate two approaches: (a) uniform composition, which averages the upstream LoRA modules with equal weights, and (b) learned composition, which learns a weight for each upstream module and performs weighted averaging. Our experimental results on both vision and language models reveal that in few-shot settings, where only a limited number of samples are available for the downstream task, both uniform and learned composition yield better transfer accuracy, outperforming full fine-tuning and training a LoRA from scratch. Moreover, in full-shot settings, learned composition performs comparably to regular LoRA training with significantly fewer trainable parameters. Our research unveils the potential of uniform composition for enhancing transferability in low-shot settings without introducing additional learnable parameters.
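The two composition strategies described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each upstream module contributes a low-rank update of the form B @ A, that composition is performed on these updates rather than on the factors separately, and that the learned weights are already given. The function names `lora_delta` and `compose`, the matrix sizes, and the example weights are all hypothetical.

```python
import numpy as np

def lora_delta(A, B):
    """Low-rank weight update B @ A with shape (d_out, d_in)."""
    return B @ A

def compose(deltas, weights=None):
    """Weighted sum of per-module updates.

    weights=None gives uniform composition (each module weighted 1/N).
    Passing explicit weights stands in for learned composition; how the
    weights are trained or normalized is outside this sketch.
    """
    stacked = np.stack(deltas)                # (N, d_out, d_in)
    n = stacked.shape[0]
    if weights is None:
        weights = np.full(n, 1.0 / n)         # equal weight per module
    weights = np.asarray(weights, dtype=stacked.dtype)
    # Contract the module axis: sum_i weights[i] * stacked[i]
    return np.tensordot(weights, stacked, axes=1)

# Hypothetical sizes: 3 upstream modules of rank 2 on an 8 x 6 weight matrix.
rng = np.random.default_rng(0)
d_out, d_in, rank, n_modules = 8, 6, 2, 3
modules = [
    (rng.normal(size=(rank, d_in)), rng.normal(size=(d_out, rank)))
    for _ in range(n_modules)
]
deltas = [lora_delta(A, B) for A, B in modules]

uniform = compose(deltas)                          # uniform composition
learned = compose(deltas, weights=[0.7, 0.2, 0.1]) # illustrative weights only
```

In this sketch, uniform composition introduces no new parameters, while the learned variant would train only one scalar per upstream module, which is one way to read the abstract's claim about the small number of trainable parameters.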
