Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer

Guodong Du, Zitao Fang, Jing Li, Junlin Li, Runhua Jiang, Shuyang Yu, Yifei Guo, Yangneng Chen, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Honghai Liu, Min Zhang

TL;DR

This work tackles the redundancy and limited out-of-domain generalization of fine-tuned checkpoints by pruning task-vector deltas with Neural Parameter Search (NPS). NPS decomposes the fine-tuning delta $\tau = \theta_{ft} - \theta_{pre}$ into components $\{q_m\}$ within low-rank subspaces and optimizes their weights $\{w_m\}$ via CMA-ES to form a pruned update, yielding $\hat{\theta}_{ft} = \theta_{pre} + \sum_m w_m q_m$ before applying a sparsity mask. The authors demonstrate three practical applications (knowledge transfer, knowledge fusion, and knowledge compression), showing that NPS can mitigate catastrophic forgetting, enable robust multi-task fusion, and substantially reduce storage while preserving near-original performance across vision, NLP, and multimodal benchmarks. The approach is gradient-free and lightweight, with reported gains of up to $+3.0\%$ in vision fusion and nearly 99% normalized accuracy in compression, which makes it useful for scalable deployment of fine-tuned models. Overall, NPS is a simple tool for slimming fine-tuned models while preserving transferability across tasks and domains.
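
The update rule in the TL;DR can be illustrated with a small sketch. It is not the authors' implementation: it assumes the subspaces are formed by grouping task-vector entries by magnitude rank (the summary does not say how they are built), it uses a plain magnitude mask, and it swaps CMA-ES for a much simpler (1+1) evolution strategy so that it runs with the standard library alone. All function names (`split_by_magnitude`, `nps_prune`, `search_weights`) are hypothetical.

```python
import random

def task_vector(theta_ft, theta_pre):
    # tau = theta_ft - theta_pre, element-wise over flat parameter lists
    return [f - p for f, p in zip(theta_ft, theta_pre)]

def split_by_magnitude(tau, num_subspaces):
    # ASSUMPTION: subspace m holds the m-th block of entries when sorted by |tau_i|.
    # Each component q_m has the same length as tau and is zero elsewhere.
    order = sorted(range(len(tau)), key=lambda i: abs(tau[i]), reverse=True)
    chunk = -(-len(tau) // num_subspaces)  # ceiling division
    components = []
    for m in range(num_subspaces):
        q = [0.0] * len(tau)
        for i in order[m * chunk:(m + 1) * chunk]:
            q[i] = tau[i]
        components.append(q)
    return components

def reweighted_update(components, weights):
    # sum_m w_m * q_m
    n = len(components[0])
    return [sum(w * q[i] for w, q in zip(weights, components)) for i in range(n)]

def magnitude_mask(update, keep_ratio):
    # ASSUMPTION: the sparsity mask keeps the largest-magnitude entries.
    k = max(1, int(round(keep_ratio * len(update))))
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = set(ranked[:k])
    return [u if i in keep else 0.0 for i, u in enumerate(update)]

def nps_prune(theta_pre, theta_ft, weights, num_subspaces, keep_ratio):
    # theta_hat = theta_pre + mask(sum_m w_m q_m)
    tau = task_vector(theta_ft, theta_pre)
    components = split_by_magnitude(tau, num_subspaces)
    update = magnitude_mask(reweighted_update(components, weights), keep_ratio)
    return [p + u for p, u in zip(theta_pre, update)]

def search_weights(loss_fn, num_subspaces, steps=200, sigma=0.2, seed=0):
    # Gradient-free stand-in for CMA-ES: a (1+1) evolution strategy that
    # perturbs the weights with Gaussian noise and keeps only improvements.
    rng = random.Random(seed)
    best = [1.0] * num_subspaces
    best_loss = loss_fn(best)
    for _ in range(steps):
        candidate = [w + rng.gauss(0.0, sigma) for w in best]
        loss = loss_fn(candidate)
        if loss < best_loss:
            best, best_loss = candidate, loss
    return best, best_loss
```

In practice `loss_fn` would evaluate the pruned model on held-out data for the task; here it is left as a caller-supplied function because the summary does not describe the search objective.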

Abstract

Foundation models and their checkpoints have significantly advanced deep learning, boosting performance across various applications. However, fine-tuned models often struggle outside their specific domains and exhibit considerable redundancy. Recent studies suggest that combining a pruned fine-tuned model with the original pre-trained model can mitigate forgetting, reduce interference when merging model parameters across tasks, and improve compression efficiency. In this context, developing an effective pruning strategy for fine-tuned models is crucial. Leveraging the advantages of the task vector mechanism, we preprocess fine-tuned models by calculating the differences between them and the original model. Recognizing that different task vector subspaces contribute variably to model performance, we introduce a novel method called Neural Parameter Search (NPS-Pruning) for slimming down fine-tuned models. This method enhances pruning efficiency by searching through neural parameters of task vectors within low-rank subspaces. Our method has three key applications: enhancing knowledge transfer through pairwise model interpolation, facilitating effective knowledge fusion via model merging, and enabling the deployment of compressed models that retain near-original performance while significantly reducing storage costs. Extensive experiments across vision, NLP, and multi-modal benchmarks demonstrate the effectiveness and robustness of our approach, resulting in substantial performance gains. The code is publicly available at: https://github.com/duguodong7/NPS-Pruning.
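
The three applications named in the abstract all reuse the same pruned task vector. The sketch below is an illustration under stated assumptions, not the paper's procedure: interpolation is written as scaling the pruned task vector by a coefficient `lam`, merging uses plain task arithmetic (a summed, scaled update), and compression stores only the nonzero entries as an index-to-value map. The helper names are hypothetical.

```python
def interpolate(theta_pre, pruned_tau, lam):
    # Knowledge transfer: (1 - lam) * theta_pre + lam * (theta_pre + tau_hat)
    # simplifies to theta_pre + lam * tau_hat.
    return [p + lam * t for p, t in zip(theta_pre, pruned_tau)]

def merge(theta_pre, pruned_taus, lam):
    # Knowledge fusion, ASSUMING simple task arithmetic:
    # theta_merged = theta_pre + lam * sum_t tau_hat_t
    summed = [sum(entries) for entries in zip(*pruned_taus)]
    return [p + lam * s for p, s in zip(theta_pre, summed)]

def to_sparse(pruned_tau):
    # Knowledge compression: keep only the nonzero entries of the pruned task vector.
    return {i: v for i, v in enumerate(pruned_tau) if v != 0.0}

def from_sparse(theta_pre, sparse_tau):
    # Rebuild the fine-tuned weights from the shared pre-trained model
    # plus the stored sparse update.
    return [p + sparse_tau.get(i, 0.0) for i, p in enumerate(theta_pre)]
```

Because every task shares one copy of the pre-trained weights, only the sparse per-task update needs to be stored, which is where the storage saving comes from.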

Paper Structure

This paper contains 48 sections, 12 equations, 8 figures, 14 tables.

Figures (8)

  • Figure 1: Knowledge transfer, fusion, and compression are enhanced with the assistance of pre-trained model parameters. The fine-tuned model is effectively represented as a combination of the pre-trained model and pruned task vectors, leading to knowledge retention.
  • Figure 2: Performance of ViT-B/32 models on a specific task (SUN397 dataset). Different subspaces of neural parameters within the task vector contribute differently to the performance of the fine-tuned model.
  • Figure 3: The framework of Neural Parameter Search enhances the efficiency of pruning fine-tuned models. This is achieved by searching and reweighting the neural parameters of task vectors within low-rank subspaces.
  • Figure 4: Performance variations of different methods with changes in sparsity ratio. Our NPS method exhibits higher tolerance to varying levels of sparsity.
  • Figure 5: Averaged normalized accuracy and storage cost versus the number of tasks on computer vision benchmarks. Our proposed NPS method consistently preserves initial performance across various task combinations while significantly compressing the fine-tuned checkpoints.
  • ...and 3 more figures