Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models

Liyi Zhang; Jake Snell; Thomas L. Griffiths

TL;DR

The paper introduces ABMLL, a scalable Amortized Bayesian Meta-Learning approach for LoRA-tuned large language models, enabling task-conditioned uncertainty modeling without per-task parameter copies. By expressing both global and task-specific weights with LoRA adapters and employing a variational Bayesian objective with a beta-balanced reconstruction term, ABMLL delivers improved generalization to unseen tasks and better uncertainty calibration on large models like Llama3-8B. Empirical results on CrossFit and UnifiedQA show ABMLL outperforming standard LoRA and other meta-learning baselines in accuracy and ECE, while remaining memory-efficient and robust to pruning. The work bridges Bayesian methods and LLM fine-tuning, highlighting the potential for inductive bias and reliable uncertainty estimation in scalable meta-learning for large models.

Abstract

Fine-tuning large language models (LLMs) with low-rank adaptation (LoRA) is a cost-effective way to incorporate information from a specific dataset. However, it is often unclear how well the fine-tuned LLM will generalize, i.e., how well it will perform on unseen datasets. Methods have been proposed to improve generalization by optimizing in-context prompts, or by using meta-learning to fine-tune LLMs. However, these methods are expensive in memory and computation, requiring either long-context prompts or saving copies of parameters and using second-order gradient updates. To address these challenges, we propose Amortized Bayesian Meta-Learning for LoRA (ABMLL). This method builds on amortized Bayesian meta-learning for smaller models, adapting this approach to LLMs while maintaining its computational efficiency. We reframe task-specific and global parameters in the context of LoRA and use a new hyperparameter to balance reconstruction accuracy and the fidelity of task-specific parameters to the global ones. ABMLL provides effective generalization and scales to large models such as Llama3-8B. Furthermore, as a result of using a Bayesian framework, ABMLL provides improved uncertainty quantification. We test ABMLL on the CrossFit and UnifiedQA datasets and find that it outperforms existing methods on these benchmarks in terms of both accuracy and expected calibration error.
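The abstract reports results in terms of expected calibration error (ECE). As a reminder of what that metric measures, here is a minimal sketch of the standard equal-width-bin ECE estimator. It is not taken from the paper: the function name, the default of 10 bins, and the half-open binning convention are illustrative assumptions, and the authors' evaluation code may differ in these details.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[int],
    n_bins: int = 10,  # assumption: 10 equal-width bins, a common default
) -> float:
    """Equal-width-bin ECE: sum over bins of (bin size / N) * |accuracy - mean confidence|.

    confidences: predicted probability of the chosen answer, each in [0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins

    for c, y in zip(confidences, correct):
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of exactly 1.0
        # is placed in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += y
        count[idx] += 1

    ece = 0.0
    for k in range(n_bins):
        if count[k] == 0:
            continue  # empty bins contribute nothing
        mean_conf = conf_sum[k] / count[k]
        accuracy = hit_sum[k] / count[k]
        ece += (count[k] / n) * abs(accuracy - mean_conf)
    return ece
```

A lower value means the model's stated confidence tracks its empirical accuracy more closely; a model that is always right and always fully confident scores 0.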
