UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models

Xueyan Zhang, Jinman Zhao, Zhifei Yang, Yibo Zhong, Shuhao Guan, Linbo Cao, Yining Wang

TL;DR

UORA tackles the high cost of fine-tuning large models by freezing projection matrices and introducing an interpolation-based reinitialization mechanism that selectively refreshes parts of the frozen adapters. The method achieves state-of-the-art parameter efficiency, often matching or surpassing LoRA while using far fewer trainable parameters, and demonstrates strong results across GLUE, E2E, instruction-tuning benchmarks, and image classification. By leveraging orthogonal uniform initialization and magnitude-based pruning, UORA maintains stable training and competitive accuracy with negligible inference overhead. This work offers a scalable, resource-efficient PEFT framework with broad applicability to NLP and CV tasks.

Abstract

This paper introduces Uniform Orthogonal Reinitialization Adaptation (UORA), a novel parameter-efficient fine-tuning (PEFT) approach for Large Language Models (LLMs). UORA achieves state-of-the-art performance and parameter efficiency by leveraging a low-rank approximation method to reduce the number of trainable parameters. Unlike existing methods such as LoRA and VeRA, UORA employs an interpolation-based reparametrization mechanism that selectively reinitializes rows and columns in frozen projection matrices, guided by a vector magnitude heuristic. As a result, UORA uses substantially fewer trainable parameters than LoRA and outperforms VeRA in computation and storage efficiency. Comprehensive experiments across various benchmarks show that UORA achieves competitive fine-tuning performance with negligible computational overhead. We demonstrate its performance on the GLUE and E2E benchmarks and its effectiveness in instruction-tuning large language models and image classification models. Our contributions establish a new paradigm for scalable and resource-efficient fine-tuning of LLMs.
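To make the parameter-efficiency claim concrete, the following is a minimal sketch of per-layer trainable-parameter counts. It assumes, as the Figure 1 caption describes, that UORA trains only a pair of scaling vectors while the projection matrices stay frozen, and that those vectors have lengths `r` and `d_out` as in VeRA. The function names, layer sizes, and ranks are illustrative assumptions, not values from the paper.

```python
def lora_trainable(d_in: int, d_out: int, r: int) -> int:
    """LoRA trains two low-rank matrices: one d_in x r and one r x d_out."""
    return r * (d_in + d_out)


def scaling_vector_trainable(d_out: int, r: int) -> int:
    """Assumed VeRA-style count: one vector of length r and one of length d_out."""
    return r + d_out


# Illustrative sizes only (hypothetical, not taken from the paper).
d_in, d_out = 768, 768
print(lora_trainable(d_in, d_out, r=8))        # 12288
print(scaling_vector_trainable(d_out, r=256))  # 1024
```

Under these assumed sizes, the scaling-vector scheme trains 12 times fewer parameters per layer than LoRA, even at a much higher rank, because its count grows additively in `r` and `d_out` rather than multiplicatively.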

Paper Structure

This paper contains 43 sections, 5 equations, 2 figures, 15 tables.

Figures (2)

  • Figure 1: Overview of LoRA (left) and UORA (right). LoRA trains a pair of projection matrices, $A$ and $B$, with low rank $r$. The update to the pretrained weights is thus represented as ${\Delta} W = A \times B$. UORA adopts a strategy similar to VeRA's: both projection matrices are frozen and randomly initialized, and a pair of scaling vectors, $\vec{d}$ and $\vec{b}$, is trained to adapt the frozen matrices. The key difference is that UORA applies an interpolation-based reinitialization mechanism to selectively and partially update $A$ and $B$. As in all LoRA-based PEFT methods, the learned weight update ${\Delta} W$ can be merged into $W$, so inference incurs no additional latency.
  • Figure 2: Performance vs. number of parameters for LoRA and UORA on MRPC in the GLUE benchmark.
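
The frozen-matrix adaptation and weight merging described in the Figure 1 caption can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it uses a row-vector convention so that the update reads ${\Delta} W = A \times B$ as in the caption, it takes $\vec{d}$ to have length $r$ and $\vec{b}$ to have length `d_out` as in VeRA, it uses plain uniform sampling as a stand-in for the paper's orthogonal uniform initialization, and it omits the interpolation-based reinitialization step, whose details are not given here. All sizes and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 4  # illustrative sizes (assumption)

W = rng.standard_normal((d_in, d_out))  # pretrained weight, frozen
A = rng.uniform(-1.0, 1.0, (d_in, r))   # frozen random projection (stand-in init)
B = rng.uniform(-1.0, 1.0, (r, d_out))  # frozen random projection (stand-in init)

# Trainable scaling vectors; random values here only so the check is non-trivial.
d = rng.standard_normal(r)      # scales the rank dimension
b = rng.standard_normal(d_out)  # scales the output dimension


def adapted_forward(x: np.ndarray) -> np.ndarray:
    """Unmerged path: frozen base output plus the scaled low-rank branch."""
    return x @ W + ((x @ A) * d) @ B * b


def merged_weight() -> np.ndarray:
    """Fold the update into W: delta_W = A diag(d) B diag(b)."""
    delta_W = (A * d) @ (B * b)
    return W + delta_W


x = rng.standard_normal((3, d_in))
# The merged weight reproduces the adapted output with a single matmul.
print(np.allclose(adapted_forward(x), x @ merged_weight()))
```

Because the merged matrix has the same shape as `W`, a deployed model can replace `W` once after training and run without the extra branch, which is what the caption means by merging the update for inference.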