PEFT A2Z: Parameter-Efficient Fine-Tuning Survey for Large Language and Vision Models
Nusrat Jahan Prottasha, Upama Roy Chowdhury, Shetu Mohanto, Tasfia Nuzhat, Abdullah As Sami, Md Shamol Ali, Md Shohanur Islam Sobuj, Hafijur Raman, Md Kowsher, Ozlem Ozmen Garibay
TL;DR
This survey analyzes the resource and fine-tuning challenges of large language, vision, and multimodal models and advocates parameter-efficient fine-tuning (PEFT) as a scalable solution. It introduces a structured taxonomy of additive, selective, reparameterized, hybrid, and unified approaches, and details design considerations (quantization, routing, memory, KV-cache, pruning, energy, multimodal). Through cross-domain evaluation (NLP, vision, multimodal, and robotics), it shows that PEFT methods such as LoRA, adapters, RoCoFT, Propulsion, and SK-Tuning can approach or surpass full fine-tuning performance with far fewer trainable parameters. The paper also discusses open challenges (interpretability, theory, benchmarks, privacy, hardware considerations) and outlines future directions, including federated and continual learning, to broaden PEFT's practical impact. Overall, PEFT emerges as a practical, scalable pathway to democratizing the deployment of massive foundation models while curbing computational and environmental costs.
Abstract
Large models such as Large Language Models (LLMs) and Vision-Language Models (VLMs) have transformed artificial intelligence, powering applications in natural language processing, computer vision, and multimodal learning. However, full fine-tuning of these models remains expensive, requiring extensive computational resources, memory, and task-specific data. Parameter-Efficient Fine-Tuning (PEFT) has emerged as a promising solution that adapts large models to downstream tasks by updating only a small portion of their parameters. This survey presents a comprehensive overview of PEFT techniques, focusing on their motivations, design principles, and effectiveness. We begin by analyzing the resource and accessibility challenges posed by traditional fine-tuning and highlighting key issues such as overfitting, catastrophic forgetting, and parameter inefficiency. We then introduce a structured taxonomy of PEFT methods, grouped into additive, selective, reparameterized, hybrid, and unified frameworks, and systematically compare their mechanisms and trade-offs. Beyond taxonomy, we explore the impact of PEFT across diverse domains, including language, vision, and generative modeling, showing how these techniques offer strong performance at lower resource cost. We also discuss important open challenges in scalability, interpretability, and robustness, and suggest future directions such as federated learning, domain adaptation, and theoretical grounding. Our goal is to provide a unified understanding of PEFT and its growing role in enabling practical, efficient, and sustainable use of large models.
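To make "updating only a small portion of parameters" concrete, the following minimal sketch counts trainable parameters for one weight matrix under full fine-tuning and under a LoRA-style low-rank update, and shows the corresponding forward pass with the pretrained weight frozen. The function names, the 4096 x 4096 layer size, and the rank of 8 are illustrative assumptions, not values taken from this survey.

```python
# Illustrative sketch only: names and sizes are assumptions, not from the survey.

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # full fine-tuning updates every entry of W
    lora = rank * (d_out + d_in)   # LoRA trains B (d_out x rank) and A (rank x d_in)
    return full, lora


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute W x + scale * B (A x); W is frozen, only A and B would be trained."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(4096, 4096, 8)
fraction = lora / full
print(f"full: {full}, LoRA: {lora}, trainable fraction: {fraction:.4%}")
```

Under these assumed sizes the low-rank factors hold well under 1% of the parameters of the full matrix. When B is all zeros, as in the usual LoRA initialization, `lora_forward` returns exactly the frozen model's output, so training starts from the pretrained behavior.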
