Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation

Abhinav Jain; Swarat Chaudhuri; Thomas Reps; Chris Jermaine

TL;DR

Low-Rank Prompt Adaptation (LoPA) is a prompt-tuning-based approach that performs on par with state-of-the-art PEFT methods and full fine-tuning while being more parameter-efficient and requiring no server-based adapter.

Abstract

Parameter-Efficient Fine-Tuning (PEFT) has become the standard for customising Foundation Models (FMs) to user-specific downstream tasks. However, typical PEFT methods require storing multiple task-specific adapters, which creates scalability issues because these adapters must be housed and run at the FM server. Traditional prompt tuning offers a potential solution by customising FMs through task-specific input prefixes, but it under-performs other PEFT methods such as LoRA. To address this gap, we propose Low-Rank Prompt Adaptation (LoPA), a prompt-tuning-based approach that performs on par with state-of-the-art PEFT methods and full fine-tuning while being more parameter-efficient and not requiring a server-based adapter. LoPA generates soft prompts by balancing the sharing of task-specific information across instances against customisation for each instance. It uses a low-rank decomposition of the soft-prompt component encoded for each instance to achieve parameter efficiency. We provide a comprehensive evaluation on multiple natural language understanding tasks and on code generation and understanding tasks, across a wide range of foundation models of varying sizes.
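To make the idea in the abstract concrete, the following is a minimal sketch of how a soft prompt could be built from a shared task-level component and a low-rank instance-level component, and then prepended to the input embeddings. The abstract does not state the exact formulation, so several parts are assumptions made purely for illustration: the sigmoid gate used to combine the two components, the function names (`low_rank_soft_prompt`, `prepend_soft_prompt`, `param_counts`), and the dimensions in the usage line. In an actual system the factors `u` and `v` would presumably be produced per instance by an encoder; here they are passed in directly.

```python
import math


def low_rank_soft_prompt(z_shared, u, v):
    """Combine a shared prompt with a low-rank instance-specific component.

    z_shared: m x d list of lists, shared across all instances of a task.
    u (m x r) and v (r x d): low-rank factors of the instance-specific
    component, so that component is u @ v with rank at most r.
    ASSUMPTION: the two components are combined with an elementwise
    sigmoid gate; the abstract does not specify the combination.
    """
    m, d, r = len(z_shared), len(z_shared[0]), len(v)
    z = []
    for i in range(m):
        row = []
        for j in range(d):
            # Entry (i, j) of the low-rank product u @ v.
            z_inst = sum(u[i][k] * v[k][j] for k in range(r))
            gate = 1.0 / (1.0 + math.exp(-z_inst))
            row.append(z_shared[i][j] * gate)
        z.append(row)
    return z


def prepend_soft_prompt(z, x_e):
    """Concatenate soft prompt Z (m x d) and input embeddings X_e (n x d)
    along the sequence axis, giving an (m + n) x d input."""
    return z + x_e


def param_counts(m, d, r):
    """Entries in a full m x d matrix versus its rank-r factors."""
    return {"full": m * d, "low_rank": r * (m + d)}


# Hypothetical sizes: 10 prompt tokens, embedding width 2048, rank 4.
counts = param_counts(m=10, d=2048, r=4)
```

With these hypothetical sizes, the full instance-specific matrix would have 20480 entries while its two rank-4 factors together have 8232, which is the kind of saving a low-rank decomposition is meant to provide.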

Paper Structure (14 sections, 5 equations, 8 figures, 4 tables)

Introduction
Related Work
Proposed Methodology
Preliminaries
Low-rank Prompt Adaptation (LoPA)
Experiments, Results, and Discussion
Experimental Setup
Baseline Comparison
Ablation Study
Conclusion
Acknowledgments
Appendix
Comparison with Parameterized Hypercomplex Multiplication (PHM) layers
Convergence Analysis of Soft-Prompting Approaches

Figures (8)

Figure 1: A schematic illustrating how typical PEFT methods like LoRA achieve personalization of a foundation model for multiple tasks, such as Yes/No text classification or code completion, during inference.
Figure 2: An illustration of LoPA. No task-specific adapters need to be stored on the server. $|$ denotes the concatenation of the soft prompt $\textbf{Z}$ and the input prompt $\textbf{X}_e$, i.e., $\textbf{X}=\textrm{concat}(\textbf{Z}|\textbf{X}_e)$.
Figure 3: Performance comparison of baselines as a function of $m$ on (a)-(c) GLUE benchmark and (d) CruxEval-O (with DeepseekCoder-1.3B as FM). Tunable parameters shown relative to the method with the most. Higher performance and fewer parameters indicate better results.
Figure 4: Performance of LoPA as a function of rank shown for $m=10$. (a) GLUE Benchmarks and (b) CruxEval tasks $(I, O)$ where ds-1.3 denotes DeepseekCoder-1.3B and phi-2 denotes Phi2-2.7B models. Higher performance and fewer tunable parameters indicate better results.
Figure 5: Ablation for Encoder in LoPA with DeepseekCoder-1.3B as the foundation model.
...and 3 more figures
