RoSA: Enhancing Parameter-Efficient Fine-Tuning via RoPE-aware Selective Adaptation in Large Language Models
Dayan Pan, Jingyuan Wang, Yilong Zhou, Jiawei Cheng, Pengyue Jia, Xiangyu Zhao
TL;DR
RoSA tackles the inefficiency of traditional PEFT by exploiting RoPE-induced low-frequency attention components and layer-wise heterogeneity. It introduces RoAE to selectively enhance RoPE low-frequency dimensions and DLS to dynamically update the most impactful layers, guided by LayerNorm gradient norms. Across fifteen benchmarks and multiple backbones, RoSA consistently outperforms mainstream PEFT methods under comparable trainable parameter budgets and scales effectively with model size. The framework is modular and broadly applicable to PEFT, offering a principled approach to frequency- and layer-aware fine-tuning with practical improvements in contextual understanding and efficiency.
Abstract
Fine-tuning large language models is essential for task-specific adaptation, yet it remains computationally prohibitive. Parameter-Efficient Fine-Tuning (PEFT) methods have emerged as a solution, but current approaches typically ignore the distinct roles of model components and the heterogeneous importance across layers, thereby limiting adaptation efficiency. Motivated by the observation that Rotary Position Embeddings (RoPE) induce critical activations in the low-frequency dimensions of attention states, we propose RoPE-aware Selective Adaptation (RoSA), a novel PEFT framework that allocates trainable parameters in a more targeted and effective manner. RoSA comprises a RoPE-aware Attention Enhancement (RoAE) module, which selectively enhances the low-frequency components of RoPE-influenced attention states, and a Dynamic Layer Selection (DLS) strategy that adaptively identifies and updates the most critical layers based on LayerNorm gradient norms. By combining dimension-wise enhancement with layer-wise adaptation, RoSA achieves more targeted and efficient fine-tuning. Extensive experiments on fifteen commonsense and arithmetic benchmarks demonstrate that RoSA outperforms existing mainstream PEFT methods under comparable trainable parameters. The code is available to ease reproducibility at https://github.com/Applied-Machine-Learning-Lab/RoSA.
