ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion

Rana Muhammad Shahroz Khan, Dongwen Tang, Pingzhi Li, Kai Wang, Tianlong Chen

TL;DR

ORAL introduces a scalable, conditional recurrent diffusion framework that generates LoRA updates for large-scale, evolving foundation models by conditioning on both base-model architecture and textual task prompts. By tokenizing LoRA updates and applying a recurrent diffusion backbone, ORAL achieves high-capacity parameter generation (up to hundreds of millions of parameters) while maintaining task-specific controllability and transferability across model updates without retraining. Extensive experiments across vision, multimodal, and NLP tasks show that ORAL matches or surpasses traditional fine-tuning baselines and generalizes well to unseen evolving models. This approach enables efficient, flexible adaptation in rapidly changing LLM ecosystems, reducing retraining costs and making deployment at scale practical.

Abstract

Parameter generation has emerged as a novel paradigm for neural network development, offering an alternative to traditional neural network training by synthesizing high-quality model weights directly. In the context of Low-Rank Adaptation (LoRA) for evolving (i.e., constantly updated) large language models (LLMs), this approach promises efficient adaptation without costly retraining. However, existing methods face critical limitations in simultaneously achieving scalability and controllability. In this paper, we introduce ORAL, a novel conditional recurrent diffusion framework that addresses these challenges. ORAL incorporates a novel conditioning mechanism that integrates model architecture and textual task specifications, enabling the generation of task-specific LoRA parameters that can seamlessly transfer across evolving foundation models. Our approach successfully scales to LLMs with billions of parameters while maintaining controllability. Through extensive experiments across seven language tasks, four vision tasks, and three multimodal tasks using five pre-trained LLMs, we demonstrate that ORAL generates high-quality LoRA parameters that achieve performance comparable or superior to that of their vanilla-trained counterparts.

Paper Structure

This paper contains 39 sections, 14 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Overview of our ORAL framework. (a) Recurrent Generation: The foundation model weights are processed through a tokenizer to create weight tokens, which are then fed into a recurrent architecture consisting of both Mamba and Diffusion models to generate parameters from noise $\mathcal{N}(0,I)$. (b) Conditional Generation: Our approach supports evolving foundation models by adapting parameters $\mathtt{W}_i$ to updated foundation models $\mathtt{W}_i + \Delta_i$ without retraining, which is enabled by our novel conditioning mechanism that incorporates model architecture and textual task specifications.
  • Figure 2: Comparison of parameter generation capacity between our proposed method, ORAL, and the baseline Cond P-Diff. ORAL effectively generates LoRA adapters at a significantly larger scale (e.g., for 7B models), surpassing the capacity of Cond P-Diff, which fails to operate efficiently at higher parameter scales. This demonstrates ORAL's ability to handle large-scale parameter synthesis, which is crucial for adapting modern large language models.
  • Figure 3: Ablation results showing accuracy comparisons across NLP tasks using random model embeddings, random textual embeddings, and our method with meaningful embeddings. Higher accuracy achieved by our conditional embeddings highlights their importance in guiding effective LoRA adapter generation.
  • Figure 4: Accuracy comparison of our synthesized LoRA adapters against zero-shot base models on the unseen evolved Mistral continually pretrained on AlpacaGPT4.
  • Figure 5: Accuracy comparison of our synthesized LoRA adapters against zero-shot base models on the unseen evolved Mistral continually pretrained on GPT4LLM.
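
The tokenization step named in Figure 1(a) can be illustrated with a minimal sketch. This section does not specify how ORAL's tokenizer works, so everything below is an assumption made for illustration: the helper names (`flatten_lora`, `tokenize`, `detokenize`), the fixed `token_size`, and the zero-padding scheme are hypothetical and are not taken from the paper. The sketch only shows the general idea of turning the two LoRA factors into a sequence of equal-length weight tokens that a sequence model could consume, and recovering the flat weights afterwards.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def flatten_lora(a: Matrix, b: Matrix) -> List[float]:
    """Concatenate the entries of LoRA factors A and B into one flat vector."""
    return [x for row in a for x in row] + [x for row in b for x in row]


def tokenize(flat: List[float], token_size: int) -> Tuple[List[List[float]], int]:
    """Split a flat weight vector into fixed-size tokens (hypothetical scheme).

    The last token is zero-padded; the pad count is returned so that the
    padding can be removed again in detokenize().
    """
    if token_size <= 0:
        raise ValueError("token_size must be positive")
    pad = (-len(flat)) % token_size
    padded = flat + [0.0] * pad
    tokens = [padded[i:i + token_size] for i in range(0, len(padded), token_size)]
    return tokens, pad


def detokenize(tokens: List[List[float]], pad: int) -> List[float]:
    """Invert tokenize(): join the tokens and drop the trailing padding."""
    flat = [x for tok in tokens for x in tok]
    return flat[:len(flat) - pad] if pad else flat


# Toy rank-2 adapter: A is (r x d_in) = 2 x 3, B is (d_out x r) = 4 x 2.
a = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

flat = flatten_lora(a, b)                   # 6 + 8 = 14 entries
tokens, pad = tokenize(flat, token_size=4)  # 4 tokens of length 4, 2 padded zeros
restored = detokenize(tokens, pad)          # same 14 entries as flat
```

Keeping the pad count alongside the tokens makes the round trip lossless, which matters because generated tokens must eventually be reshaped back into the adapter's factor matrices.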