Parameter-Efficient Subspace Optimization for LLM Fine-Tuning
Yuchen Lou, Zeqi Ye, Minshuo Chen
TL;DR
This work reframes parameter-efficient fine-tuning of large language models as a subspace minimization problem, introducing PESO to unify LoRA-like methods with principled optimization. It couples exploration of new subspaces guided by full gradients with exploitation inside current subspaces via SVD-based representations, enabling stronger convergence guarantees in the full-parameter space. The paper presents PESO-LoRA-R and PESO-LoRA-T as practical instantiations, achieving improved performance on GLUE, reasoning and code tasks while maintaining memory efficiency. Theoretical results show convergence to stationary points under full gradient restart, with exact convergence possible when subspaces align with full gradients, and empirical evidence across NLP benchmarks supports the method's effectiveness and robustness.
Abstract
This paper develops a new perspective on parameter-efficient fine-tuning for LLMs, inspired by the classical theory of subspace minimization. We introduce a unifying framework, Parameter-Efficient Subspace Optimization (PESO), which not only recovers many existing methods such as LoRA but also connects them to the principled algorithmic and theoretical foundations of subspace optimization. This connection highlights a natural "exploration-exploitation" view of subspace methods, guiding the design of new algorithms that achieve strong convergence performance while preserving memory efficiency. Importantly, our framework establishes convergence in the full-parameter space, resolving a critical gap in LoRA variants, whose low-rank updates lack such guarantees. We further instantiate the framework as a practical algorithm, PESO-LoRA, based on a LoRA-type parameterization. Our algorithm achieves notable improvements over existing methods on standard benchmarks.
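To make the exploration-exploitation view concrete, the following is a minimal toy sketch and not the paper's PESO-LoRA algorithm. It assumes a quadratic loss 0.5 * ||W - W_target||_F^2, a LoRA-style update W + B A, and a restart rule that resets B to the top singular vectors of the full gradient. The function names, the restart schedule, and the hyperparameters are all hypothetical choices made for illustration.

```python
import numpy as np

def full_gradient(W, W_target):
    # Gradient of the toy loss 0.5 * ||W - W_target||_F^2 (an assumed stand-in
    # for a fine-tuning loss).
    return W - W_target

def subspace_descent(W_target, rank=2, outer_steps=20, inner_steps=10, lr=0.5):
    m, n = W_target.shape
    W = np.zeros((m, n))  # base weights; low-rank updates are merged into them
    losses = []
    for _ in range(outer_steps):
        # Exploration (assumed restart rule): pick a new rank-r subspace from
        # the SVD of the full gradient at the current point.
        G = full_gradient(W, W_target)
        U, _, _ = np.linalg.svd(G, full_matrices=False)
        B = U[:, :rank]            # m x r orthonormal basis, held fixed below
        A = np.zeros((rank, n))    # r x n coefficients, trained below
        # Exploitation: gradient descent on A only, inside the current subspace.
        for _ in range(inner_steps):
            G_full = full_gradient(W + B @ A, W_target)
            # Chain rule: d/dA f(W + B A) = B^T grad f. A real implementation
            # would obtain this by backpropagation without forming G_full.
            A -= lr * (B.T @ G_full)
        W = W + B @ A              # merge the low-rank update
        losses.append(0.5 * np.linalg.norm(W - W_target) ** 2)
    return W, losses

rng = np.random.default_rng(0)
W_target = rng.standard_normal((8, 6))
W, losses = subspace_descent(W_target)
```

In this toy setting each restart removes the largest remaining components of the residual, so the loss is non-increasing across outer steps even though every individual update has rank at most 2. The sketch is only meant to illustrate why periodically re-aligning the subspace with the full gradient lets a sequence of low-rank updates make progress in the full-parameter space.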
