Towards Modular LLMs by Building and Reusing a Library of LoRAs

Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni

TL;DR

This work investigates building and reusing a library of LoRA adapters to create modular LLMs. It introduces Model-Based Clustering (MBC), which groups tasks by the similarity of their LoRA weights, and Arrow routing, which selects relevant adapters zero-shot, without access to training data. Experiments on Phi-2 and Mistral with 256 training tasks show that MBC-based adapters combined with Arrow routing generalize well to unseen tasks, matching or surpassing full joint training in some settings. Overall, the approach enables asynchronous, collaborative adapter development, with efficient routing that combines the adapters into flexible, scalable modular LLMs.

Abstract

The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such a library. We benchmark existing approaches to build this library and introduce model-based clustering, MBC, a method that groups tasks based on the similarity of their adapter parameters, indirectly optimizing for transfer across the multi-task dataset. To re-use the library, we present a novel zero-shot routing mechanism, Arrow, which enables dynamic selection of the most relevant adapters for new inputs without the need for retraining. We experiment with several LLMs, such as Phi-2 and Mistral, on a wide array of held-out tasks, verifying that MBC-based adapters and Arrow routing lead to superior generalization to new tasks. We take steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training.
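
The grouping step behind MBC can be illustrated with a minimal sketch. The abstract only says that tasks are grouped by the similarity of their adapter parameters; the choices below (flattening each LoRA into one vector, cosine similarity, and a small spherical k-means loop) are assumptions made for illustration, and the function names `flatten_lora` and `cluster_tasks` are hypothetical rather than the authors' API.

```python
import numpy as np


def flatten_lora(A, B):
    """Concatenate a LoRA's two low-rank matrices into a single vector."""
    return np.concatenate([np.ravel(A), np.ravel(B)])


def cluster_tasks(vectors, num_clusters, num_iters=20, seed=0):
    """Group tasks by cosine similarity of their flattened LoRA weights.

    Assumption: a plain spherical k-means; the source does not specify
    the clustering algorithm in this section.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(vectors, dtype=float)
    # Normalize rows so that dot products equal cosine similarities.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Initialize centroids from distinct task vectors (fancy indexing copies).
    centroids = X[rng.choice(len(X), size=num_clusters, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(num_iters):
        # Assign each task to the most similar centroid.
        labels = np.argmax(X @ centroids.T, axis=1)
        for k in range(num_clusters):
            members = X[labels == k]
            if len(members) == 0:
                continue  # keep the previous centroid for an empty cluster
            total = members.sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0:
                centroids[k] = total / norm
    return labels


# Toy usage: four "tasks" whose flattened adapters form two obvious groups.
task_vectors = [
    flatten_lora([[1.0]], [[0.0]]),
    flatten_lora([[1.0]], [[0.1]]),
    flatten_lora([[0.0]], [[1.0]]),
    flatten_lora([[0.1]], [[1.0]]),
]
labels = cluster_tasks(task_vectors, num_clusters=2)
```

In this sketch, each resulting cluster would then receive one jointly trained adapter, so the number of clusters sets the size of the library.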

Paper Structure

This paper contains 25 sections, 2 equations, 5 figures, 10 tables, 2 algorithms.

Figures (5)

  • Figure 1: How to coordinate a library of adapters (e.g., LoRAs) for zero-shot generalization to new tasks? To build this library (top), we propose MBC, a novel method that clusters tasks based on the similarity of the parameters of the corresponding LoRAs. To reuse a library (either private or MBC, bottom), we route hidden states to trained LoRAs via Arrow, which leverages the singular value decomposition (SVD) of each LoRA.
  • Figure 2: For any pair of tasks, we report the cosine similarity between the corresponding LoRA weights (x-axis) against the delta in performance between LoRAs trained on them individually and jointly (y-axis). The positive correlation indicates that if LoRAs are dissimilar, we should abstain from multi-task training.
  • Figure 3: Comparison of routing approaches with both Private and MBC libraries. Left & Middle: Downstream zero-shot performance on two backbones; Arrow outperforms other routing approaches for Private libraries, while for MBC libraries routing is less important. Right: Upstream performance on the held-out sets of each of the 256 training tasks. Arrow nearly matches Oracle routing (which uses information about the task identity) for the Private library and noticeably improves for MBC.
  • Figure 4: Phi-2 zero-shot accuracy on the 10 held-out tasks (left) and validation log-likelihood on the training tasks (right) as a function of the number of MBC clusters.
  • Figure 5: Histogram of the ratios $r$ computed over 5000 samples.
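
The Figure 1 caption says that Arrow routes hidden states to LoRAs using the SVD of each LoRA. A minimal sketch of one way to do this follows. It assumes that each adapter's prototype is the top right singular vector of its weight update, that the update is `B @ A`, and that scores are absolute dot products passed through a softmax over the top-k adapters; none of these details are stated in the captions above, and the function names are hypothetical.

```python
import numpy as np


def arrow_prototype(A, B):
    """Return the input direction that a LoRA update amplifies the most.

    Assumed convention: delta_W = B @ A, with A of shape (r, d_in)
    and B of shape (d_out, r).
    """
    delta_W = np.asarray(B, dtype=float) @ np.asarray(A, dtype=float)
    _, _, Vt = np.linalg.svd(delta_W, full_matrices=False)
    return Vt[0]  # unit-norm top right singular vector, shape (d_in,)


def arrow_route(h, prototypes, top_k=1):
    """Pick the top_k adapters for hidden state h and weight them."""
    # The sign of a singular vector is arbitrary, so use the absolute value.
    scores = np.abs(np.asarray(prototypes) @ np.asarray(h, dtype=float))
    top = np.argsort(-scores)[:top_k]
    logits = scores[top]
    weights = np.exp(logits - logits.max())  # numerically stable softmax
    weights = weights / weights.sum()
    return top, weights


# Toy usage: two rank-1 adapters sensitive to different input coordinates.
A1, B1 = [[1.0, 0.0, 0.0]], [[1.0], [0.0]]
A2, B2 = [[0.0, 1.0, 0.0]], [[0.0], [1.0]]
prototypes = np.stack([arrow_prototype(A1, B1), arrow_prototype(A2, B2)])
hidden_state = np.array([0.1, -2.0, 0.3])
selected, weights = arrow_route(hidden_state, prototypes, top_k=1)
```

Because the prototypes are computed from the adapter weights alone, a router of this kind needs no training data, which is consistent with the zero-shot setting described in the abstract.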