Propulsion: Steering LLM with Tiny Fine-Tuning
Md Kowsher, Nusrat Jahan Prottasha, Prakash Bhat
TL;DR
Propulsion is a parameter-efficient fine-tuning method that freezes the pre-trained weights and learns a per-layer diagonal Propulsion matrix that re-scales layer outputs, steering model behavior with a small fraction of the trainable parameters. An analysis grounded in Neural Tangent Kernel (NTK) theory, with formal bounds and empirical validation, shows that updating only this diagonal subset of parameters can closely approximate full fine-tuning. Across GLUE, SQuAD, summarization, and multiple large language models, Propulsion delivers competitive or superior performance while sharply reducing parameter count and training resources compared to established PEFT methods. It also converges faster and uses less memory, offering a practical, scalable path for task-specific adaptation of large transformers, though it offers limited granularity of control and depends on the quality of the pre-trained model.
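The core mechanism described above (frozen weights plus a learned diagonal re-scaling of the layer output) can be illustrated with a toy example. This is a minimal sketch, not the authors' implementation: it assumes a single linear layer, a squared-error objective, and plain gradient descent, none of which are specified in the summary. The names `linear`, `propulsion_forward`, and `train_propulsion` are hypothetical, and initializing the scaling vector to ones (so the layer starts out identical to the frozen one) is an assumption. Plain Python lists are used only to keep the sketch dependency-free.

```python
def linear(W, x):
    """Frozen pre-trained layer: returns W @ x, with W given as a list of rows."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def propulsion_forward(W, z, x):
    """Re-scale each output dimension of the frozen layer by z.

    z holds the diagonal of the scaling matrix, stored as a vector.
    """
    return [zi * vi for zi, vi in zip(z, linear(W, x))]


def train_propulsion(W, data, lr=0.05, epochs=200):
    """Fit only z by gradient descent on squared error; W is never updated."""
    z = [1.0] * len(W)  # assumption: start at ones, i.e. the unmodified layer
    for _ in range(epochs):
        for x, target in data:
            v = linear(W, x)  # frozen-layer output
            for i in range(len(z)):
                err = z[i] * v[i] - target[i]
                z[i] -= lr * 2.0 * err * v[i]  # gradient of (z_i * v_i - t_i)^2 w.r.t. z_i
    return z


# Toy task: targets are the frozen outputs re-scaled by a hidden per-dimension factor.
W = [[1.0, 0.0], [0.5, 1.0]]  # 4 frozen weights
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
true_scale = [0.5, 2.0]
data = [(x, [s * v for s, v in zip(true_scale, linear(W, x))]) for x in inputs]

z = train_propulsion(W, data)  # 2 trainable values
print([round(zi, 4) for zi in z])
```

In this toy setting the learned vector should land close to the hidden factors 0.5 and 2.0 while `W` stays untouched: two trainable values steer a layer with four frozen weights. For a real layer with a d-by-d weight matrix, a diagonal scaling of this kind would train d values instead of d squared.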
Abstract
Rapid advances in Large Language Models (LLMs) have revolutionized natural language processing (NLP) and related fields. However, fine-tuning these models for specific tasks remains computationally expensive and risks degrading pre-learned features. To address these challenges, we propose Propulsion, a novel parameter-efficient fine-tuning (PEFT) method designed to optimize task-specific performance while drastically reducing computational overhead. Inspired by the concept of controlled adjustments in physical motion, Propulsion selectively re-scales specific dimensions of a pre-trained model, guiding output predictions toward task objectives without modifying the model's pre-trained parameters. By introducing lightweight, trainable Propulsion parameters at the pre-trained layers, we minimize the number of parameters updated during fine-tuning, preventing overfitting or overwriting of existing knowledge. Our theoretical analysis, supported by Neural Tangent Kernel (NTK) theory, shows that Propulsion approximates the performance of full fine-tuning with far fewer trainable parameters. Empirically, Propulsion reduces the parameter count from 355.3 million to just 0.086 million, which is also more than 10x fewer than standard approaches like LoRA, while maintaining competitive performance across benchmarks.
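To put the reported parameter counts in perspective, the short calculation below works out the ratio between the two figures quoted in the abstract. It uses only those two numbers; the variable names are illustrative. LoRA's parameter count is not stated here, so the 10x comparison against LoRA is not reproduced.

```python
# Figures quoted in the abstract, in millions of parameters.
full_params_m = 355.3        # parameter count before the reduction
propulsion_params_m = 0.086  # parameter count with Propulsion

# How many times smaller the Propulsion count is.
reduction_factor = full_params_m / propulsion_params_m

# Propulsion count as a percentage of the original count.
trainable_percent = 100.0 * propulsion_params_m / full_params_m

print(f"reduction factor: {reduction_factor:.0f}x")
print(f"share of original count: {trainable_percent:.3f}%")
```

The two figures differ by a factor of roughly 4,100, so the Propulsion parameters amount to about 0.024% of the original count.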
