SliceFine: The Universal Winning-Slice Hypothesis for Pretrained Networks
Md Kowsher, Ali O. Polat, Ehsan Mohammady Ardehaly, Mehrdad Salehi, Zia Ghiasi, Prasanth Murali, Chen Chen
TL;DR
The paper addresses why tiny, randomly selected slices within pretrained networks can suffice for downstream adaptation. It introduces the Universal Winning Slice Hypothesis (UWSH), grounded in spectral balance across slice groups and high task energy in frozen backbones, to show that any sufficiently wide slice can be a local winning ticket and that a small set of slices can form a global winning ticket. Building on this theory, SliceFine updates only moving slices across layers and adds zero parameters, achieving accuracy competitive with strong PEFT baselines while improving training speed, memory efficiency, and model compactness. Empirical results span language, vision, and video tasks, and ablations map how slice rank, switching interval, and backbone quality influence performance. The work bridges theory and practice, offering a theoretically grounded, practical alternative to adapter-based and pruning-based approaches and suggesting a universal slice-based pathway to parameter-efficient fine-tuning in large-scale pretrained models.
Abstract
This paper presents a theoretical framework explaining why fine-tuning small, randomly selected subnetworks (slices) within pretrained models can be sufficient for downstream adaptation. We prove that pretrained networks exhibit a universal winning-slice property arising from two phenomena: (1) spectral balance: the eigenspectra of different weight-matrix slices are remarkably similar; and (2) high task energy: their backbone representations retain rich, task-relevant features. This leads to the Universal Winning Slice Hypothesis, which provides a theoretical foundation for parameter-efficient fine-tuning (PEFT) in large-scale models. Inspired by this, we propose SliceFine, a PEFT method that exploits this inherent redundancy by updating only selected slices of the original weights, introducing zero new parameters, unlike adapter-based approaches. Empirically, SliceFine matches the performance of state-of-the-art PEFT methods across language and vision tasks, while significantly improving training speed, memory efficiency, and model compactness. Our work bridges theory and practice, offering a theoretically grounded alternative to existing PEFT techniques.
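To make the idea of updating only a slice of the original weights concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the slice is taken to be a block of `rank` consecutive columns, the block slides to the next position every `switch_every` steps, and the task is a toy least-squares problem. The names `slice_mask`, `train`, and `switch_every` are hypothetical.

```python
import numpy as np


def slice_mask(shape, start, rank):
    """Boolean mask selecting `rank` consecutive columns, wrapping around.

    Assumption: a "slice" is a column block; the abstract does not fix this.
    """
    mask = np.zeros(shape, dtype=bool)
    cols = (start + np.arange(rank)) % shape[1]
    mask[:, cols] = True
    return mask


def train(W0, X, Y, steps, rank, switch_every, lr):
    """Gradient descent on mean squared error, updating only the active slice.

    W keeps the shape of W0 throughout, so no new parameters are introduced.
    Returns the adapted weights and the loss recorded before each update.
    """
    W = W0.copy()
    losses = []
    for step in range(steps):
        # Hypothetical schedule: move the slice every `switch_every` steps.
        start = (step // switch_every) * rank
        mask = slice_mask(W.shape, start, rank)

        residual = X @ W.T - Y                       # (batch, d_out)
        losses.append(float(np.mean(residual ** 2)))
        grad = 2.0 * residual.T @ X / residual.size  # d(MSE)/dW, same shape as W

        # Entries outside the slice receive a zero update, i.e. stay frozen.
        W = W - lr * np.where(mask, grad, 0.0)
    return W, losses


rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 16, 4
W0 = rng.normal(size=(d_out, d_in))  # stand-in for a pretrained weight matrix
X = rng.normal(size=(32, d_in))      # toy inputs
Y = rng.normal(size=(32, d_out))     # toy targets

W, losses = train(W0, X, Y, steps=10, rank=rank, switch_every=10, lr=0.5)
changed_cols = np.flatnonzero(np.any(W != W0, axis=0))
print("columns updated:", changed_cols)
print("loss before first and last update:", losses[0], losses[-1])
```

With `steps=10` and `switch_every=10` the slice never moves, so only the first `rank` columns differ from `W0`. Running more steps lets the slice visit further column blocks while the parameter count stays fixed. In a deep-learning framework the same effect would typically come from masking gradients before the optimizer step.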
