Explicit Inductive Bias for Transfer Learning with Convolutional Networks

Xuhong Li; Yves Grandvalet; Franck Davoine

TL;DR

The paper addresses how to preserve useful knowledge from a pre-trained CNN during transfer learning by introducing explicit inductive biases that anchor fine-tuning to the initial weights. It develops a family of regularizers, notably L2-SP, which penalize deviations from the pre-trained parameters, and systematically evaluates them against standard L2, L1, and Group-Lasso variants across diverse source/target pairs. Empirically, L2-SP consistently improves target-task accuracy over conventional fine-tuning, with larger gains in low-data regimes, and adds minimal computational overhead; Fisher-based variants show limited additional benefit in this setting. The work proposes L2-SP as a robust, simple baseline for transfer learning and offers theoretical and empirical insight into why staying close to the pre-trained solution helps retain useful source-task representations. The evidence also extends to segmentation benchmarks such as Cityscapes.

Abstract

In inductive transfer learning, fine-tuning pre-trained convolutional networks substantially outperforms training from scratch. When using fine-tuning, the underlying assumption is that the pre-trained model extracts generic features, which are at least partially relevant for solving the target task, but would be difficult to extract from the limited amount of data available on the target task. However, besides initialization with the pre-trained model and early stopping, there is no mechanism in fine-tuning for retaining the features learned on the source task. In this paper, we investigate several regularization schemes that explicitly promote the similarity of the final solution with the initial model. We show the benefit of having an explicit inductive bias towards the initial model, and we ultimately recommend, as the baseline penalty for transfer learning tasks, a simple $L^2$ penalty that uses the pre-trained model as its reference.
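The difference between a standard $L^2$ penalty and one that uses the pre-trained model as its reference can be illustrated with a minimal sketch. It assumes the penalty takes the form (alpha/2) * ||w_shared - w0_shared||^2 for layers inherited from the pre-trained network, plus (beta/2) * ||w_new||^2 for layers added for the target task. The function names, the dictionary layout, and the values of `alpha` and `beta` are hypothetical and chosen only for illustration; they are not taken from this section.

```python
from typing import Dict, List

# Hypothetical layout: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def l2_penalty(weights: Params, alpha: float) -> float:
    """Standard weight decay: pulls every weight towards zero."""
    return 0.5 * alpha * sum(w ** 2 for layer in weights.values() for w in layer)


def l2_sp_penalty(weights: Params, pretrained: Params,
                  alpha: float, beta: float) -> float:
    """Sketch of an L2 penalty with the pre-trained weights as reference.

    Layers present in `pretrained` are pulled towards their starting
    point; layers without a pre-trained counterpart (for example a new
    classifier head) are pulled towards zero.
    """
    shared = 0.0  # squared distance to the pre-trained weights
    new = 0.0     # squared norm of the newly added weights
    for name, layer in weights.items():
        if name in pretrained:
            shared += sum((w - w0) ** 2 for w, w0 in zip(layer, pretrained[name]))
        else:
            new += sum(w ** 2 for w in layer)
    return 0.5 * alpha * shared + 0.5 * beta * new


# Toy example: one inherited layer and one new, zero-initialised head.
pretrained = {"conv1": [1.0, -2.0]}
at_start = {"conv1": [1.0, -2.0], "fc": [0.0, 0.0]}
after_drift = {"conv1": [2.0, -2.0], "fc": [0.0, 0.0]}

start_sp = l2_sp_penalty(at_start, pretrained, alpha=0.1, beta=0.1)
drift_sp = l2_sp_penalty(after_drift, pretrained, alpha=0.1, beta=0.1)
start_l2 = l2_penalty(at_start, alpha=0.1)
```

In this toy setting the reference-based penalty is zero at initialization and grows only as the inherited layer drifts from its pre-trained values, whereas the standard penalty is already non-zero at initialization because it measures distance from the origin rather than from the pre-trained solution.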
