HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning
Chunlin Tian, Zhan Shi, Zhijiang Guo, Li Li, Chengzhong Xu
TL;DR
This work tackles the inefficiency of conventional LoRA in heterogeneous, multi-task domains by revealing that a single LoRA head causes cross-task interference. It introduces HydraLoRA, an asymmetric LoRA architecture with a shared matrix $A$ and multiple task-specific matrices $B_i$, guided by a Mixture-of-Experts router to automatically allocate inputs to appropriate adapters. Empirical results across single-domain and multi-task benchmarks show HydraLoRA consistently surpasses standard PEFT methods and even LoRA with task-specific splits, while reducing parameter overhead through shared learning and modular specialization. The approach enables domain-robust fine-tuning and efficient inference, offering a practical path to high-performance, low-parameter LLM adaptation in complex real-world tasks.
Abstract
Parameter-Efficient Fine-Tuning (PEFT) techniques such as LoRA have made adapting Large Language Models (LLMs) to new tasks more efficient. However, these methods often underperform full fine-tuning, particularly on complex datasets and in complex domains, which highlights the need for PEFT approaches that achieve better performance. Through a series of experiments, we uncovered two critical insights that shed light on the training and parameter inefficiency of LoRA. Building on these insights, we developed HydraLoRA, a LoRA framework with an asymmetric structure that eliminates the need for domain expertise. Our experiments demonstrate that HydraLoRA outperforms other PEFT approaches, even those that rely on domain knowledge during the training and inference phases.
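The asymmetric structure summarized in the TL;DR (one shared matrix $A$, several matrices $B_i$, and a router that weights them per input) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `HydraLoRALayerSketch`, the dense softmax gating, the `alpha / rank` scaling, the initialization scales, and all dimensions are assumptions chosen for illustration, and only the forward pass is shown.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class HydraLoRALayerSketch:
    """Illustrative layer: frozen W0, one shared A, several B heads, a router.

    All names and hyperparameters are hypothetical, not taken from the paper.
    """

    def __init__(self, d_in, d_out, rank=4, num_heads=3, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen pretrained weight (random stand-in for this sketch).
        self.W0 = rng.normal(0.0, 0.02, size=(d_out, d_in))
        # Shared down-projection A, used by every head.
        self.A = rng.normal(0.0, 0.02, size=(rank, d_in))
        # One up-projection B_i per head; zero init (as in standard LoRA)
        # so the adapter initially leaves the base output unchanged.
        self.B = np.zeros((num_heads, d_out, rank))
        # Router weights (assumed: a single linear layer followed by softmax).
        self.W_router = rng.normal(0.0, 0.02, size=(num_heads, d_in))
        self.scale = alpha / rank

    def gates(self, x):
        """Per-input mixing weights over the B heads, shape (batch, num_heads)."""
        return softmax(x @ self.W_router.T, axis=-1)

    def forward(self, x):
        """x has shape (batch, d_in); returns shape (batch, d_out)."""
        z = x @ self.A.T                                  # (batch, rank)
        head_out = np.einsum("br,hor->bho", z, self.B)    # (batch, heads, d_out)
        delta = np.einsum("bh,bho->bo", self.gates(x), head_out)
        return x @ self.W0.T + self.scale * delta


def adapter_param_counts(d_in, d_out, rank, num_heads):
    """Adapter parameters (router excluded): shared-A layout vs. separate LoRAs."""
    shared_a = rank * d_in + num_heads * d_out * rank
    separate = num_heads * (rank * d_in + d_out * rank)
    return shared_a, separate
```

For example, with `d_in=16`, `d_out=8`, `rank=4`, and `num_heads=3`, `adapter_param_counts` returns `(160, 288)`: sharing $A$ stores the down-projection once instead of once per head, which is where the parameter saving in this layout comes from.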
