Quantum-PEFT: Ultra parameter-efficient fine-tuning
Toshiaki Koike-Akino, Francesco Tonin, Yongtao Wu, Frank Zhengqing Wu, Leyla Naz Candogan, Volkan Cevher
TL;DR
The paper tackles the rising cost of fine-tuning large pre-trained models by introducing Quantum-PEFT, a quantum-inspired, parameter-efficient fine-tuning framework. It reparameterizes weight updates as ultra-compact unitary embeddings built from Pauli rotations, maps them to the Stiefel manifold, and assembles larger unitaries via quantum Shannon decomposition to handle arbitrary dimensions. The approach substantially reduces the number of trainable parameters (often 4–25× fewer than LoRA) while maintaining competitive accuracy on GLUE, E2E, large-scale GPT-2, and ViT CIFAR10 benchmarks, and it benefits further from quantization and intrinsic-rank masking. These results suggest that Quantum-PEFT enables scalable, memory-efficient fine-tuning of billion-parameter models, with practical implications for deployment and experimentation.
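To make the idea of a compact unitary embedding concrete, here is a minimal sketch in Python. It is not the paper's circuit: it assumes the simplest possible construction, a Kronecker product of single-qubit RY rotations with no entangling gates, and the helper names `ry` and `kron_rotation` are hypothetical. It only illustrates that a handful of angles can define a large orthogonal matrix, whose leading columns form a point on the Stiefel manifold.

```python
from functools import reduce

import numpy as np


def ry(theta):
    """Real 2x2 rotation corresponding to a single-qubit RY gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def kron_rotation(thetas):
    """Kronecker product of RY rotations: a 2^n x 2^n orthogonal matrix
    defined by only n angles (illustrative assumption, no entangling gates)."""
    return reduce(np.kron, [ry(t) for t in thetas])


rng = np.random.default_rng(0)
thetas = rng.uniform(-np.pi, np.pi, size=4)  # 4 trainable angles

Q = kron_rotation(thetas)  # 16 x 16 orthogonal matrix
V = Q[:, :3]               # first 3 columns: orthonormal 16 x 3 frame

print(Q.shape, V.shape)
print(np.allclose(V.T @ V, np.eye(3)))  # orthonormality check
```

A product of independent rotations like this reaches only a small family of orthogonal matrices; a practical parameterization would add entangling layers to make it more expressive, at the cost of a few more angles.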
Abstract
This paper introduces Quantum-PEFT, which leverages quantum computations for parameter-efficient fine-tuning (PEFT). Unlike other additive PEFT methods, such as low-rank adaptation (LoRA), Quantum-PEFT exploits an underlying full-rank yet surprisingly parameter-efficient quantum unitary parameterization. With the Pauli parameterization, the number of trainable parameters grows only logarithmically with the ambient dimension, as opposed to linearly as in LoRA-based PEFT methods. As dimensions grow, Quantum-PEFT requires a vanishingly small number of trainable parameters compared with even the lowest-rank LoRA, enhancing parameter efficiency while maintaining competitive performance. We apply Quantum-PEFT to several transfer learning benchmarks in language and vision, demonstrating significant advantages in parameter efficiency.
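The gap between linear and logarithmic growth can be illustrated with a rough parameter count. The sketch below is only an approximation under stated assumptions: the LoRA count uses the standard two-factor form, rank × (n + m), while `log_scaling_params` is a hypothetical stand-in that assumes one angle per qubit per layer for each of two unitary factors. The paper's exact count may differ by constants and additional terms.

```python
import math


def lora_params(n, m, rank):
    """Trainable parameters of a rank-`rank` LoRA update for an n x m weight."""
    return rank * (n + m)


def log_scaling_params(n, m, layers=1):
    """Hypothetical count for a log-scaling parameterization: one angle per
    qubit per layer, for two unitary factors (an assumption, not the paper's
    exact formula)."""
    qubits_n = math.ceil(math.log2(n))
    qubits_m = math.ceil(math.log2(m))
    return layers * (qubits_n + qubits_m)


for dim in (256, 1024, 4096):
    print(dim, lora_params(dim, dim, rank=1), log_scaling_params(dim, dim))
```

Under these assumptions, multiplying the dimension by 16 (from 256 to 4096) multiplies the rank-1 LoRA count by 16, while the logarithmic count grows only from 16 to 24.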
