Towards Understanding Feature Learning in Parameter Transfer

Hua Yuan; Xuran Meng; Qiufeng Wang; Shiyu Xia; Ning Xu; Xu Yang; Jing Wang; Xin Geng; Yong Rui

TL;DR

To the best of our knowledge, this work provides the first dynamic analysis of parameter transfer and the first theoretical proof that negative transfer exists.

Abstract

Parameter transfer is a central paradigm in transfer learning, enabling knowledge reuse across tasks and domains by sharing model parameters between upstream and downstream models. However, when only a subset of the upstream model's parameters is transferred to the downstream model, it remains theoretically unclear under what conditions such partial parameter reuse is beneficial and which factors govern its effectiveness. To address this gap, we analyze a setting in which both the upstream and downstream models are ReLU convolutional neural networks (CNNs). Within this theoretical framework, we characterize how the inherited parameters act as carriers of universal knowledge and identify key factors that amplify their beneficial impact on the target task. Furthermore, our analysis provides insight into why, in certain cases, transferring parameters can lead to lower test accuracy on the target task than training a new model from scratch. To the best of our knowledge, our theory is the first to provide a dynamic analysis of parameter transfer and the first to prove theoretically that negative transfer exists. Numerical experiments and experiments on real-world data empirically validate our theoretical findings.
