Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs

Zixuan Hu; Yongxian Wei; Li Shen; Chun Yuan; Dacheng Tao

TL;DR

This work tackles the lack of tuning-free few-shot adaptability in Visual Foundation Models (VFMs) by reusing diverse pre-tuned LoRAs without access to their original training data. It introduces LoRA Inversion to synthesize surrogate data from the pre-tuned LoRAs, and a meta-learning objective to distill a lightweight meta-LoRA that adapts to new tasks in a single forward pass, with no fine-tuning. A double-efficient mechanism accelerates both data inversion and meta-training by selectively keeping informative tokens, improving efficiency without sacrificing performance. Across in-domain and cross-domain benchmarks, LoRA Recycle improves over fine-tuning baselines and other tuning-free methods, highlighting its practical potential for rapid VFM deployment without access to private data.

Abstract

Large Language Models (LLMs) such as ChatGPT demonstrate strong few-shot adaptability without requiring fine-tuning, making them ideal for data-limited and real-time applications. However, this adaptability has not yet been replicated in current Visual Foundation Models (VFMs), which require explicit fine-tuning with sufficient tuning data. Meanwhile, the pretraining-finetuning paradigm has led to a proliferation of task-specific modular components, such as Low-Rank Adaptation (LoRA) modules. For the first time, we explore the potential of reusing diverse pre-tuned LoRAs, without accessing their original training data, to achieve tuning-free few-shot adaptation in VFMs. Our framework, LoRA Recycle, distills a meta-LoRA from diverse pre-tuned LoRAs with a meta-learning objective, using surrogate data generated inversely from the pre-tuned LoRAs themselves. The VFM, once equipped with the meta-LoRA, can solve new few-shot tasks in a single forward pass, akin to the in-context learning of LLMs. Additionally, we incorporate a double-efficient mechanism tailored to our framework, significantly accelerating the meta-training process while maintaining or even improving performance. Extensive experiments on various few-shot classification benchmarks, in both in-domain and cross-domain scenarios, demonstrate the superiority of our framework.
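To make "solving a few-shot task in a single forward pass" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy one-layer feature extractor whose frozen weight `W` is augmented by a low-rank LoRA update `B @ A`, and it assumes a nearest-prototype classifier on the resulting features; the abstract does not specify the classification head, so that choice and all function names here are hypothetical.

```python
import numpy as np

def lora_linear(x, W, A, B, scale=1.0):
    """Linear layer with frozen weight W (d_out, d_in) plus a low-rank update B @ A."""
    return x @ (W + scale * (B @ A)).T

def embed(x, W, A, B):
    """Toy feature extractor (assumption): one LoRA-equipped layer, tanh, L2-normalize."""
    feats = np.tanh(lora_linear(x, W, A, B))
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    return feats / np.maximum(norms, 1e-12)

def classify_single_pass(support_x, support_y, query_x, W, A, B):
    """Label queries by nearest class prototype; no gradient step is taken.

    The prototype classifier is an illustrative assumption, not something
    the abstract confirms.
    """
    s = embed(support_x, W, A, B)
    q = embed(query_x, W, A, B)
    classes = np.unique(support_y)
    # One prototype per class: the mean embedding of its support examples.
    protos = np.stack([s[support_y == c].mean(axis=0) for c in classes])
    sims = q @ protos.T  # similarity of each query to each prototype
    return classes[sims.argmax(axis=1)]
```

In this sketch, `A` and `B` stand in for a meta-LoRA that is already trained: adapting to a new task only requires embedding the labeled support examples and the queries, so no parameters are updated at test time.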
