MTA: A Merge-then-Adapt Framework for Personalized Large Language Model

Xiaopeng Li, Yuanjin Zheng, Wanyu Wang, Wenlin Zhang, Pengyue Jia, Yiqi Wang, Maolin Wang, Xuetao Wei, Xiangyu Zhao

TL;DR

MTA addresses scalability and data-sparsity in personalized LLMs by introducing a three-stage Merge-then-Adapt framework. It builds a fixed Meta-LoRA Bank of anchor LoRAs, retrieves and linearly merges the most similar anchors to create a personalized base, and finally stacks an ultra-low-rank adaptation trained on the target user’s sparse history. The approach combines cross-user collaborative knowledge with a lightweight, user-specific residual to deliver robust personalization without per-user storage overhead. Experiments on the LaMP benchmark show state-of-the-art performance across five tasks, with favorable efficiency and ablation results demonstrating the necessity of both the merge and the final adaptation step. Overall, MTA enables scalable, data-efficient PLLMs suitable for real-world, large-scale deployment, reducing both training time and parameter storage while maintaining high personalization quality.
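To make the storage argument concrete, the following is a rough back-of-the-envelope sketch rather than the paper's own accounting. It assumes a LoRA of rank r on a d_in x d_out weight matrix stores r * (d_in + d_out) parameters, and that the only per-user state kept alongside the shared bank is the ultra-low-rank stacked module. All sizes (hidden dimension, ranks, number of adapted matrices, user and anchor counts) and all function names are hypothetical values chosen for illustration.

```python
def lora_params(d_in, d_out, rank, n_matrices):
    """Parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank),
    repeated for every adapted weight matrix."""
    return rank * (d_in + d_out) * n_matrices


def per_user_storage(n_users, d_in, d_out, rank, n_matrices):
    """One full-rank-r LoRA stored for every user (grows linearly in n_users)."""
    return n_users * lora_params(d_in, d_out, rank, n_matrices)


def bank_plus_residual_storage(n_users, n_anchors, d_in, d_out,
                               anchor_rank, user_rank, n_matrices):
    """A fixed bank of anchor LoRAs plus one small residual LoRA per user.
    Assumption: the residual is the only per-user state that is stored."""
    bank = n_anchors * lora_params(d_in, d_out, anchor_rank, n_matrices)
    residuals = n_users * lora_params(d_in, d_out, user_rank, n_matrices)
    return bank + residuals


# Hypothetical configuration, not taken from the paper.
D, N_MATRICES = 4096, 64
per_user = per_user_storage(n_users=10_000, d_in=D, d_out=D,
                            rank=8, n_matrices=N_MATRICES)
shared_bank = bank_plus_residual_storage(n_users=10_000, n_anchors=100,
                                         d_in=D, d_out=D, anchor_rank=8,
                                         user_rank=1, n_matrices=N_MATRICES)
print(f"per-user LoRAs:      {per_user:,} parameters")
print(f"bank + rank-1 stack: {shared_bank:,} parameters")
print(f"reduction factor:    {per_user / shared_bank:.2f}x")
```

Under these assumed numbers, the bank is a fixed cost and the per-user term shrinks in proportion to the ratio of the two ranks, so the gap widens as the number of users grows.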

Abstract

Personalized Large Language Models (PLLMs) aim to align model outputs with individual user preferences, a crucial capability for user-centric applications. However, the prevalent approach of fine-tuning a separate module for each user faces two major limitations: (1) storage costs scale linearly with the number of users, rendering the method unscalable; and (2) fine-tuning a static model from scratch often yields suboptimal performance for users with sparse data. To address these challenges, we propose MTA, a Merge-then-Adapt framework for PLLMs. MTA comprises three key stages. First, we construct a shared Meta-LoRA Bank by selecting anchor users and pre-training meta-personalization traits within meta-LoRA modules. Second, to ensure scalability and enable dynamic personalization combination beyond static models, we introduce an Adaptive LoRA Fusion stage. This stage retrieves and dynamically merges the most relevant anchor meta-LoRAs to synthesize a user-specific one, thereby eliminating the need for user-specific storage and supporting more flexible personalization. Third, we propose a LoRA Stacking for Few-Shot Personalization stage, which applies an additional ultra-low-rank, lightweight LoRA module on top of the merged LoRA. Fine-tuning this module enables effective personalization under few-shot settings. Extensive experiments on the LaMP benchmark demonstrate that our approach outperforms existing SOTA methods across multiple tasks.
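The three stages can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the user and anchor embeddings, the cosine-similarity retrieval, the softmax-normalized merge coefficients, and the choice to merge the weight updates B @ A (rather than the A and B factors separately) are all assumptions made for illustration, as are the function names and toy dimensions.

```python
import numpy as np


def top_k_anchors(user_emb, anchor_embs, k=2):
    """Retrieve the k anchors whose embeddings are most cosine-similar to the user."""
    u = user_emb / np.linalg.norm(user_emb)
    a = anchor_embs / np.linalg.norm(anchor_embs, axis=1, keepdims=True)
    sims = a @ u
    idx = np.argsort(-sims)[:k]  # indices sorted by descending similarity
    return idx, sims[idx]


def merge_loras(bank, idx, sims):
    """Coefficient-weighted merge of the retrieved anchors' weight updates.
    Assumption: coefficients are a softmax over the retrieval similarities."""
    w = np.exp(sims - sims.max())
    w = w / w.sum()
    delta = sum(wi * (bank[i]["B"] @ bank[i]["A"]) for wi, i in zip(w, idx))
    return delta, w


def stacked_weight(W0, merged_delta, B_u, A_u):
    """Frozen base weight + frozen merged update + trainable low-rank residual."""
    return W0 + merged_delta + B_u @ A_u


# Toy sizes, hypothetical and far smaller than a real model.
rng = np.random.default_rng(0)
d_out, d_in, r_anchor, r_user, n_anchors, d_emb = 8, 6, 4, 1, 5, 16

# Stage 1: a fixed bank of pre-trained anchor LoRAs (random stand-ins here).
bank = [{"A": rng.normal(size=(r_anchor, d_in)),
         "B": rng.normal(size=(d_out, r_anchor))} for _ in range(n_anchors)]
anchor_embs = rng.normal(size=(n_anchors, d_emb))

# Stage 2: retrieve and merge for a target user who resembles anchor 3.
user_emb = anchor_embs[3] + 0.1 * rng.normal(size=d_emb)
idx, sims = top_k_anchors(user_emb, anchor_embs, k=2)
merged_delta, weights = merge_loras(bank, idx, sims)

# Stage 3: stack a rank-1 LoRA; B_u starts at zero so training begins
# exactly from the merged model, and only A_u and B_u would be updated.
W0 = rng.normal(size=(d_out, d_in))
A_u = 0.01 * rng.normal(size=(r_user, d_in))
B_u = np.zeros((d_out, r_user))
W_eff = stacked_weight(W0, merged_delta, B_u, A_u)
```

In this sketch nothing user-specific is stored for the merge itself: the merged update is recomputed from the shared bank, and only the small A_u and B_u factors would be fine-tuned on the user's history.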

Paper Structure

This paper contains 29 sections, 8 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Overview of the MTA (Merge-then-Adapt) framework. The left panel, Meta-LoRA Bank Construction, shows the pre-training of a bank of anchor modules and the retrieval of the two most relevant LoRAs for a target user. The middle panel, Adaptive LoRA Merging, shows the two retrieved LoRAs being combined through a coefficient-weighted merge into a single personalized LoRA. The right panel, LoRA Stacking for Few-Shot Personalization, shows the merged model being frozen and a new ultra-low-rank LoRA being fine-tuned on top of it using the user's history for final adaptation.
  • Figure 2: Efficiency comparison between our MTA framework and the OPPU baseline. The top plot shows total training time versus the number of users; the bottom plot shows total parameter storage.
  • Figure 3: Performance comparison on Personalized Product Rating Prediction (LaMP-3) (top row, MAE/RMSE, lower is better) and Personalized Scholarly Title Generation (LaMP-5) (bottom row, ROUGE-1/L, higher is better) tasks with different fixed merging coefficients ($\alpha_u$) versus our adaptive method (red dashed line). The blue solid line shows the performance of fixed-alpha variants.
  • Figure 4: Performance on Personalized Product Rating Prediction (LaMP-3) (top row, MAE/RMSE, lower is better) and Personalized Scholarly Title Generation (LaMP-5) (bottom row, ROUGE-1/L, higher is better) as a function of the number of merged top-K anchor users.
  • Figure 5: Case study on personalized title generation (LaMP-5) for a user specializing in Computer Vision. Our full framework generates the most accurate title by capturing the user's task-oriented focus.
  • ...and 1 more figure