PEANuT: Parameter-Efficient Adaptation with Weight-aware Neural Tweakers

Yibo Zhong; Haoxiang Jiang; Lincan Li; Ryumei Nakada; Tianci Liu; Linjun Zhang; Huaxiu Yao; Haoyu Wang

TL;DR

PEANuT tackles the inefficiency of full fine-tuning by introducing weight-aware neural tweakers that produce task-adaptive updates conditioned on frozen pre-trained weights. The method replaces LoRA’s linear, weight-agnostic updates with a lightweight nonlinear network f(W^0; θ), yielding updates ΔW that depend explicitly on the original weights and capture richer adaptation patterns. The authors prove under a Left-Singular-Space invariance condition that PEANuT achieves equal or greater expressivity with fewer parameters, and they demonstrate strong empirical gains across four benchmarks in NLP and vision with minimal overhead. Overall, PEANuT provides a practical, scalable PEFT alternative that improves performance while maintaining efficiency, enabling more effective deployment of large foundation models.
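To make the contrast with LoRA concrete, here is a minimal NumPy sketch. The LoRA update ΔW = BA is the standard formulation. The tweaker shown, ΔW = ReLU(W^0 Θ1) Θ2, is only one plausible instance of f(W^0; θ) and is an assumption of this sketch, as are the names `tweaker_delta`, `theta1`, `theta2`, the shapes, and the zero initialization of the output layer; this summary does not specify any of them.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def lora_delta(B, A):
    """LoRA-style update: a low-rank product that never looks at the frozen weights."""
    return B @ A

def tweaker_delta(W0, theta1, theta2):
    """Weight-aware update (assumed form): a small nonlinear map applied to the frozen W0."""
    return relu(W0 @ theta1) @ theta2

rng = np.random.default_rng(0)
d1, d2, r = 6, 4, 2                      # hypothetical layer sizes and bottleneck width
W0 = rng.standard_normal((d1, d2))       # frozen pre-trained weight

theta1 = rng.standard_normal((d2, r)) * 0.1   # trainable input layer
theta2 = np.zeros((r, d2))                    # trainable output layer, zero-initialized (assumption)

delta = tweaker_delta(W0, theta1, theta2)
W_adapted = W0 + delta                   # equals W0 before any training step
```

Under these assumed shapes the tweaker trains 2 * d2 * r values, while LoRA with the same rank trains r * (d1 + d2). Only `theta1` and `theta2` would receive gradients; `W0` stays fixed and serves purely as the input to the tweaker.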

Abstract

Fine-tuning large pre-trained foundation models often yields excellent downstream performance but is prohibitively expensive when updating all parameters. Parameter-efficient fine-tuning (PEFT) methods such as LoRA alleviate this by introducing lightweight update modules, yet they commonly rely on weight-agnostic linear approximations, limiting their expressiveness. In this work, we propose PEANuT, a novel PEFT framework that introduces weight-aware neural tweakers, compact neural modules that generate task-adaptive updates conditioned on frozen pre-trained weights. PEANuT provides a flexible yet efficient way to capture complex update patterns without full model tuning. We theoretically show that PEANuT achieves equivalent or greater expressivity than existing linear PEFT methods with comparable or fewer parameters. Extensive experiments across four benchmarks with over twenty datasets demonstrate that PEANuT consistently outperforms strong baselines in both NLP and vision tasks, while maintaining low computational overhead.

Paper Structure (31 sections, 2 theorems, 18 equations, 5 figures, 16 tables)

Introduction
Related Work
Methodology
Preliminary
Inherent Limitation of LoRA Formulation
Parameter-Efficient Adaptation with Weight-aware Neural Tweakers
Theoretical Analysis
Complexity Analysis
Experiment
Benchmarks and Experiment Setups
Performance Comparison
Ablation Study
Runtime and Memory Cost
Sensitivity w.r.t. Depth
Sensitivity w.r.t. Activations
...and 16 more sections

Key Result

Proposition 3.2

Given a pre-trained weight matrix $\mathbf{W}^{0}$, let $\sigma$ denote the ReLU activation function, and let $\mathbf{U}^{0} \in \mathbb{R}^{d_1 \times \operatorname{rank}(\mathbf{W}^{0})}$ be the left singular vectors of $\mathbf{W}^{0}$. Suppose that the fine-tuning loss $\mathcal{L}$ is invariant under the pr...

Figures (5)

Figure 1: Framework of the proposed PEANuT.
Figure 2: Adding depth to PEANuT. We insert multiple intermediate layers between the layers of vanilla PEANuT, with nonlinear activations in between. Depth is the number of layers in PEANuT; vanilla PEANuT has a depth of 2 (i.e., the input and output layers).
Figure 3: Accuracy on the RTE, StanfordCars, PIQA, and MATH datasets with varying depths of the neural network used in PEANuT. Depth here is the total number of layers in the neural network; the figure shows depths of 2, 4, and 6.
Figure 4: Influence of different nonlinear activation choices for PEANuT. Experiments are conducted on StanfordCars with PEANuT depth fixed to 2. Different activations show a similar pattern of dependence on the learning rate.
Figure 5: Accuracy of PEANuT with different targeted fine-tuning modules, including just QV layers and a combination of QV and MLP layers, on image classification datasets.
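
The depth convention in the Figure 2 caption (depth 2 is just the input and output layers, and deeper variants insert intermediate layers with nonlinear activations in between) can be sketched as follows. This is an illustrative reading of the caption, not the authors' implementation: the r x r shape of the intermediate layers, the ReLU activation, the zero-initialized output layer, and the helper names `init_tweaker` and `deep_tweaker_delta` are all assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def init_tweaker(d2, r, depth, rng):
    """Build `depth` layers: one input (d2 x r), depth-2 intermediates (r x r), one output (r x d2)."""
    if depth < 2:
        raise ValueError("depth must be at least 2 (input and output layers)")
    layers = [rng.standard_normal((d2, r)) * 0.1]
    layers += [rng.standard_normal((r, r)) * 0.1 for _ in range(depth - 2)]
    layers.append(np.zeros((r, d2)))  # zero output layer so the initial update is zero (assumption)
    return layers

def deep_tweaker_delta(W0, layers):
    """Apply every layer but the last with an activation, then project back to W0's shape."""
    h = W0
    for theta in layers[:-1]:
        h = relu(h @ theta)
    return h @ layers[-1]

rng = np.random.default_rng(0)
W0 = rng.standard_normal((6, 4))  # hypothetical frozen weight
for depth in (2, 4, 6):           # the depths shown in Figure 3
    layers = init_tweaker(d2=4, r=2, depth=depth, rng=rng)
    delta = deep_tweaker_delta(W0, layers)
    print(depth, len(layers), delta.shape)
```

With this layout each extra layer adds only r * r trainable values, so increasing depth is cheap compared with increasing r.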

Theorems & Definitions (5)

Remark 3.1
Proposition 3.2
Proof of Proposition 3.2
Proposition B.1: Expressivity of PEANuT with Sine Activation
proof
