Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation

Shen Yuan; Haotian Liu; Hongteng Xu

TL;DR

This study proposes a simple but effective adaptation method based on Householder reflections, which achieves superior performance with fewer learnable parameters when adapting large language models and conditional image generators.

Abstract

While following different technical routes, both low-rank and orthogonal adaptation techniques can efficiently adapt large-scale pre-trained models to specific tasks or domains using a small number of trainable parameters. In this study, we bridge the gap between these two techniques, proposing a simple but effective adaptation method based on Householder reflections. Given a pre-trained model, our method fine-tunes its layers by multiplying each frozen weight matrix with an orthogonal matrix constructed by a chain of learnable Householder reflections (HRs). This HR-based orthogonal fine-tuning is equivalent to an adaptive low-rank adaptation. Moreover, we show that the orthogonality of the reflection planes corresponding to the HRs impacts the model capacity and regularity. This analysis motivates us to regularize the orthogonality of the HRs, leading to different implementations of the proposed Householder reflection adaptation (HRA) method. Compared with state-of-the-art methods, HRA achieves superior performance with fewer learnable parameters when adapting large language models and conditional image generators. The code of the experiments is available at https://github.com/DaShenZi721/HRA, and the method has been merged into the PEFT package (https://github.com/huggingface/peft).
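The construction described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released implementation: the helper name `householder_chain`, the matrix shapes, the random initialization, and the choice to multiply the frozen weight on the right are all assumptions made for the example. It builds an orthogonal matrix from r Householder reflections and checks numerically that the resulting weight update has rank at most r, which is the sense in which this orthogonal fine-tuning behaves like a low-rank adaptation.

```python
import numpy as np

def householder_chain(U):
    """Product H_1 H_2 ... H_r of Householder reflections, one per column of U.

    U has shape (d, r); column u_i defines H_i = I - 2 u_i u_i^T / ||u_i||^2.
    Hypothetical helper, written for illustration only.
    """
    d, r = U.shape
    R = np.eye(d)
    for i in range(r):
        u = U[:, i:i + 1]                              # column vector, shape (d, 1)
        H = np.eye(d) - 2.0 * (u @ u.T) / (u.T @ u)    # a single reflection
        R = R @ H
    return R

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 16, 4
W = rng.standard_normal((d_out, d_in))   # stands in for a frozen pre-trained weight
U = rng.standard_normal((d_in, r))       # stands in for the learnable reflection vectors

R = householder_chain(U)
W_adapted = W @ R                        # orthogonal adaptation of the frozen weight
delta = W_adapted - W                    # the implied additive update

orth_error = np.abs(R.T @ R - np.eye(d_in)).max()   # deviation of R from orthogonality
delta_rank = np.linalg.matrix_rank(delta)           # rank of the implied update
```

In this sketch only `U` would be trained, so the number of learnable values is `d_in * r`, while `W` stays fixed. Building each `H` as a dense matrix is done here for readability; applying the reflections as rank-one updates would avoid forming `d_in x d_in` matrices.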

Paper Structure (28 sections, 9 equations, 11 figures, 7 tables)

Introduction
Related Work and Preliminaries
Low-rank Adaptation (LoRA)
Orthogonal Fine-tuning (OFT)
Proposed Method
Model Adaptation via Learning A Chain of Householder Reflections
Comparisons with Existing OFT Methods
Connections with Low-rank Adaptation
Enhancing The Orthogonality of Householder Reflections for Stronger Regularity
Experiments
Natural Language Understanding
Mathematical Reasoning of LLM
Controllable Text-to-Image Diffusion Models
Conclusion
The Impacts of Orthogonality
...and 13 more sections

Figures (11)

Figure 1: (a) An illustration of our HRA method. (b) Comparison of various methods on the GLUE benchmark (Wang et al., 2019). The x-axis shows the number of trainable parameters (M), and the y-axis shows the average score (%). (c) Comparison of various methods in terms of the ratio of trainable parameters and accuracy (%) when adapting LLaMA2-7B (Touvron et al., 2023) to mathematical reasoning tasks.
Figure 2: A 2D illustration indicating that when the reflection planes $\mathcal{H}_1$ and $\mathcal{H}_2$ are orthogonal, the distance $\|\bm{H}_2\bm{H}_1\bm{w}-\bm{w}\|_2$ is maximized.
Figure 3: The robustness of HRA ($r=8$) to $\lambda$ on MRPC.
Figure 4: The robustness of HRA ($r=8$) to $\lambda$ in mathematical reasoning tasks.
Figure 5: Qualitative results on subject-driven generation.
...and 6 more figures
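
The 2D claim in the Figure 2 caption can be checked numerically. The following is a small sketch that is not taken from the paper: the helper `reflect`, the test vector `w`, and the one-degree angle grid are assumptions chosen for illustration. In 2D, composing two reflections gives a rotation by twice the angle between their normals, so the displacement of `w` is largest when the normals, and hence the reflection planes, are orthogonal.

```python
import numpy as np

def reflect(u):
    """2D Householder reflection with normal direction u (hypothetical helper)."""
    u = u / np.linalg.norm(u)
    return np.eye(2) - 2.0 * np.outer(u, u)

w = np.array([1.0, 2.0])                 # arbitrary test vector
u1 = np.array([1.0, 0.0])                # normal of the first reflection plane
angles = np.linspace(0.0, np.pi, 181)    # angle between the two normals, 1-degree steps

dists = []
for t in angles:
    u2 = np.array([np.cos(t), np.sin(t)])            # normal of the second plane
    moved = reflect(u2) @ reflect(u1) @ w            # H_2 H_1 w
    dists.append(np.linalg.norm(moved - w))          # ||H_2 H_1 w - w||_2
dists = np.array(dists)

best_angle_deg = np.degrees(angles[np.argmax(dists)])  # angle with the largest distance
```

At the maximizing angle the two reflections together map `w` to `-w`, so the distance equals twice the norm of `w`.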
