Provable Meta-Learning with Low-Rank Adaptations
Jacob L. Block, Sundararajan Srinivasan, Liam Collins, Aryan Mokhtari, Sanjay Shakkottai
TL;DR
This work addresses how to train foundation models so they can be quickly adapted to unseen tasks via PEFT-based fine-tuning. It introduces a PEFT-ML framework that integrates low-rank adapters during retraining (LoRA-ML), guaranteeing that the learned base parameters are readily adaptable to future tasks. The authors prove that standard retraining is inherently suboptimal for low-rank adaptation, while LoRA-ML achieves an optimal adaptation rate of $O\big(\frac{kd}{n}\big)$ and, for $T\ge 3$, exact recovery of ground-truth parameters up to orthogonal symmetry; in the two-task case, a strict saddle property ensures efficient optimization. Empirical results on synthetic data, CIFAR-10, and ConvAI2 demonstrate consistent improvements of LoRA-ML over standard retraining and gradient-based meta-learning baselines across LoRA and last-layer fine-tuning schemes.
Abstract
The power of foundation models (FMs) lies in their capacity to learn highly expressive representations that can be adapted to a broad spectrum of tasks. However, these pretrained models require additional training stages to become effective for downstream applications. In the multi-task setting, prior works have shown empirically that specific meta-learning approaches for preparing a model for future adaptation through parameter-efficient fine-tuning (PEFT) can outperform standard retraining methods, but the mechanism behind these benefits has been largely unexplored. We introduce a framework for generic PEFT-based meta-learning to learn a model that can easily adapt to unseen tasks. For linear models using LoRA, we show that standard retraining is provably suboptimal for finding an adaptable set of parameters and provide strict performance guarantees for our proposed method. We verify these theoretical insights through experiments on synthetic data as well as real-data vision and language tasks. We observe significant performance benefits using a simple implementation of our proposed meta-learning scheme during retraining relative to the conventional approach.
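As a rough illustration of why standard retraining can be suboptimal for low-rank adaptation in the linear setting, consider the following simplified sketch. The notation here is an assumption made for illustration and is not taken from the abstract: each retraining task $t = 1, \dots, T$ has ground-truth parameters $A^* + L_t$ with $\mathrm{rank}(L_t) \le k$, all matrices are $d \times d$, and the population loss of a parameter matrix $A$ on task $t$ is taken to be $\|A - A^* - L_t\|_F^2$ (as it would be for noiseless linear regression with isotropic inputs).

$$
\hat{A}_{\mathrm{SR}} = \arg\min_{A} \sum_{t=1}^{T} \|A - A^* - L_t\|_F^2 = A^* + \frac{1}{T}\sum_{t=1}^{T} L_t,
\qquad
(A^* + L_{T+1}) - \hat{A}_{\mathrm{SR}} = L_{T+1} - \frac{1}{T}\sum_{t=1}^{T} L_t .
$$

Under these assumptions, the update needed to fit a new task $T+1$ from $\hat{A}_{\mathrm{SR}}$ can have rank as large as $\min(d, (T+1)k)$, so a rank-$k$ adapter generally cannot close the gap. By contrast, a retraining objective that includes per-task low-rank adapters $Q_t$,

$$
\min_{A,\; \{Q_t\}:\ \mathrm{rank}(Q_t) \le k}\ \sum_{t=1}^{T} \|A + Q_t - A^* - L_t\|_F^2 ,
$$

attains zero loss at $A = A^*$, $Q_t = L_t$, from which the new task is exactly a rank-$k$ update away. This objective can have other zero-loss solutions in general; the recovery result stated for $T \ge 3$ concerns when the minimizer identifies the ground-truth parameters.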
