Quantum-PEFT: Ultra parameter-efficient fine-tuning

Toshiaki Koike-Akino, Francesco Tonin, Yongtao Wu, Frank Zhengqing Wu, Leyla Naz Candogan, Volkan Cevher

TL;DR

The paper tackles the rising cost of fine-tuning large pre-trained models by introducing Quantum-PEFT, a quantum-inspired, parameter-efficient fine-tuning framework. It reparameterizes weight updates as ultra-compact unitary embeddings using Pauli rotations, mapping to the Stiefel manifold and assembling larger unitaries via quantum Shannon decomposition to handle arbitrary dimensions. The approach achieves orders-of-magnitude reductions in trainable parameters (often 4–25× vs LoRA) while maintaining competitive accuracy on GLUE, E2E, large-scale GPT-2, and ViT CIFAR10 benchmarks, and benefits further from quantization and intrinsic-rank masking. These results suggest Quantum-PEFT enables scalable, memory-efficient fine-tuning for billion-parameter models with practical implications for deployment and experimentation.

Abstract

This paper introduces Quantum-PEFT, which leverages quantum computations for parameter-efficient fine-tuning (PEFT). Unlike other additive PEFT methods, such as low-rank adaptation (LoRA), Quantum-PEFT exploits an underlying full-rank yet surprisingly parameter-efficient quantum unitary parameterization. With the use of Pauli parameterization, the number of trainable parameters grows only logarithmically with the ambient dimension, as opposed to linearly as in LoRA-based PEFT methods. Quantum-PEFT requires a vanishingly small number of trainable parameters relative to even the lowest-rank LoRA as dimensions grow, enhancing parameter efficiency while maintaining competitive performance. We apply Quantum-PEFT to several transfer learning benchmarks in language and vision, demonstrating significant advantages in parameter efficiency.
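To make the logarithmic-versus-linear scaling claim concrete, the following minimal sketch compares trainable-parameter counts for a single $N \times N$ weight. The LoRA count ($2NK$ for two $N \times K$ factors) follows directly from its definition. The Pauli count is an illustrative assumption, not the paper's exact formula: it supposes one rotation angle per qubit per layer on $\lceil \log_2 N \rceil$ qubits and counts only one unitary factor. The function names `lora_params` and `pauli_params` are hypothetical.

```python
def lora_params(n: int, rank: int) -> int:
    """Trainable parameters of a LoRA update for an n x n weight:
    two factors of shape n x rank, so the count is linear in n."""
    return 2 * n * rank


def pauli_params(n: int, layers: int) -> int:
    """ASSUMED count for one Pauli-parameterized unitary factor:
    one rotation angle per qubit per layer, with ceil(log2(n)) qubits.
    This is an illustrative model, not the paper's exact formula."""
    qubits = (n - 1).bit_length()  # exact integer ceil(log2(n)) for n >= 1
    return layers * qubits


if __name__ == "__main__":
    # Compare rank-1 LoRA with a single-layer Pauli parameterization.
    for n in (256, 1024, 4096):
        print(n, lora_params(n, rank=1), pauli_params(n, layers=1))
```

Under these assumptions, growing $N$ from 256 to 4096 multiplies the rank-1 LoRA count by 16, while the assumed Pauli count rises only from 8 to 12 angles per layer.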

Paper Structure

This paper contains 46 sections, 5 equations, 8 figures, 15 tables.

Figures (8)

  • Figure 1: Overview of Quantum-PEFT compared to LoRA and AdaLoRA for PEFT. $W$ is the frozen pretrained weight, green boxes represent trainable parameters. LoRA updates $W$ by training the low-rank matrices $A,B$. AdaLoRA introduces the SVD trainable form $U,\Lambda,V$ with regularizer $\left\lVert {U^\top U-I} \right\rVert^2+\left\lVert {V^\top V - I} \right\rVert^2$. In Quantum-PEFT, the matrices $U,V$ are not trainable parameters, but rather computed through quantum mappings of orders-of-magnitude smaller intrinsic parameters. Contrary to AdaLoRA, $U,V$ are left-orthogonal by construction in Quantum-PEFT.
  • Figure 2: QML components. (a) Simplified two-design ansatz as in \ref{eq:qp}. It alternates RY and CZ, i.e., a product of small unitary matrices. (b) Generalized quantum-inspired network for our unitary node. It generalizes the two-design ansatz to arbitrary dimensions by employing $\mathrm{SU}(N')$ blocks.
  • Figure 3: Quantum-PEFT modules with corresponding tensor diagrams. (a) Trainable mapping onto the Stiefel manifold $\mathcal{V}_K(N')$. Intrinsic rank $K'$: top $K'$ columns are trainable parameters in $B$. (b) Generalized CZ modules for diagonal nodes on either $\mathrm{O}(1)^{N'}$ or $\mathbb{R}^{N'}$.
  • Figure 4: Tensor diagram of LoRA variants.
  • Figure 5: Tensor diagrams of Quantum-PEFT and LoRA variants from a tensor-network perspective for a matrix size of $N$ and rank $K$. The number of parameters is also shown. Circles denote dense multi-linear tensor nodes. Slashed open circles denote diagonal nodes. Half-closed circles denote unitary nodes. Delay symbols denote nonlinear nodes.
  • ...and 3 more figures
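
The captions for Figures 1 and 2 describe building $U$ from alternating RY and CZ gates so that it is orthogonal by construction. A small NumPy sketch can illustrate this under stated assumptions. The CZ layout (nearest-neighbour pairs on a line), the layer count, and all function names (`ry_layer`, `cz_layer`, `ansatz`, `orth_penalty`) are hypothetical choices for illustration, not the paper's implementation. The sketch builds a $16 \times 16$ matrix from 12 angles, keeps the first $K = 2$ columns, and evaluates the AdaLoRA-style penalty $\lVert U^\top U - I \rVert^2$, which should vanish up to floating-point error.

```python
import numpy as np


def ry(theta: float) -> np.ndarray:
    """Real 2x2 rotation matrix of the Pauli-Y rotation gate RY(theta)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def ry_layer(thetas: np.ndarray) -> np.ndarray:
    """Kronecker product of one RY gate per qubit."""
    out = np.array([[1.0]])
    for t in thetas:
        out = np.kron(out, ry(t))
    return out


def cz_layer(n_qubits: int) -> np.ndarray:
    """Diagonal +/-1 matrix for CZ gates on neighbouring pairs (0,1), (1,2), ...
    ASSUMPTION: this linear layout is chosen only for illustration."""
    dim = 2 ** n_qubits
    diag = np.ones(dim)
    for idx in range(dim):
        bits = [(idx >> (n_qubits - 1 - q)) & 1 for q in range(n_qubits)]
        for q in range(n_qubits - 1):
            if bits[q] and bits[q + 1]:
                diag[idx] *= -1.0  # CZ flips the sign when both qubits are 1
    return np.diag(diag)


def ansatz(thetas: np.ndarray) -> np.ndarray:
    """Alternate RY layers with CZ layers; thetas has shape (layers, n_qubits)."""
    layers, n_qubits = thetas.shape
    u = np.eye(2 ** n_qubits)
    cz = cz_layer(n_qubits)
    for layer in range(layers):
        u = ry_layer(thetas[layer]) @ u
        if layer < layers - 1:
            u = cz @ u
    return u


def orth_penalty(m: np.ndarray) -> float:
    """Squared Frobenius norm of m^T m - I (the AdaLoRA-style regularizer)."""
    k = m.shape[1]
    return float(np.linalg.norm(m.T @ m - np.eye(k)) ** 2)


rng = np.random.default_rng(0)
thetas = rng.uniform(-np.pi, np.pi, size=(3, 4))  # 12 intrinsic angles
U = ansatz(thetas)       # 16 x 16 orthogonal matrix
U_k = U[:, :2]           # first K = 2 columns: orthonormal columns
print(U.shape, thetas.size, orth_penalty(U_k))
```

Because every factor is itself orthogonal, the product stays orthogonal for any angle values, so no regularizer is needed to keep the columns orthonormal during training.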

Theorems & Definitions (1)

  • Example 4.1