Transfusion: Understanding Transfer Learning for Medical Imaging
Maithra Raghu, Chiyuan Zhang, Jon Kleinberg, Samy Bengio
TL;DR
This work challenges the prevailing reliance on ImageNet-pretrained models for medical imaging by benchmarking standard ImageNet architectures against lightweight CNNs on two large medical tasks. Through representational analyses and weight-transfer experiments, it shows that transfer learning provides limited performance gains and that overparameterized models are often unnecessary. SVCCA reveals that pretrained representations diverge mainly in early layers, with substantial feature reuse confined to the bottom of the network. The authors further demonstrate feature-independent benefits from weight scaling and explore hybrid transfer strategies that maintain performance while enabling faster convergence and more efficient model exploration.
Abstract
Transfer learning from natural image datasets, particularly ImageNet, using standard large models and corresponding pretrained weights has become a de facto method for deep learning applications to medical imaging. However, there are fundamental differences in data sizes, features, and task specifications between natural image classification and the target medical tasks, and there is little understanding of the effects of transfer. In this paper, we explore properties of transfer learning for medical imaging. A performance evaluation on two large-scale medical imaging tasks shows that, surprisingly, transfer offers little benefit to performance, and simple, lightweight models can perform comparably to ImageNet architectures. Investigating the learned representations and features, we find that some of the differences from transfer learning are due to the over-parametrization of standard models rather than sophisticated feature reuse. We isolate where useful feature reuse occurs, and outline the implications for more efficient model exploration. We also explore feature-independent benefits of transfer arising from weight scalings.
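To make the idea of a feature-independent benefit from weight scalings concrete, the following is a minimal sketch of one way such an effect could be probed: resample every layer's weights i.i.d. from a Gaussian that matches only the mean and standard deviation of the pretrained weights, so that the scale is kept while the learned features are discarded. The function name `mean_var_init`, the dictionary layout, and the synthetic stand-in weights are all illustrative assumptions, not details stated in the text above.

```python
import numpy as np


def mean_var_init(pretrained_weights, seed=0):
    """Resample each layer i.i.d. from N(mean, std^2) of its pretrained weights.

    Hypothetical helper: it preserves only the per-layer weight scale and
    throws away the learned structure (the features) of the weights.
    """
    rng = np.random.default_rng(seed)
    resampled = {}
    for name, w in pretrained_weights.items():
        mu = float(w.mean())
        sigma = float(w.std())
        # Same shape and dtype as the original layer, but random entries.
        resampled[name] = rng.normal(mu, sigma, size=w.shape).astype(w.dtype)
    return resampled


# Synthetic stand-ins for pretrained weights (assumption: a real experiment
# would load these from an actual pretrained checkpoint instead).
_rng = np.random.default_rng(1)
pretrained = {
    "conv1": _rng.normal(0.0, 0.05, size=(7, 7, 3, 64)).astype(np.float32),
    "fc": _rng.normal(0.01, 0.02, size=(512, 10)).astype(np.float32),
}

init = mean_var_init(pretrained)
for name in pretrained:
    print(name, init[name].shape, float(init[name].mean()), float(init[name].std()))
```

Training from such an initialization and comparing convergence against both standard random initialization and full weight transfer would separate the contribution of weight scale from that of feature reuse.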
